THE QUARTERLY JOURNAL OF 


MATHEMATICS 


OXFORD SECOND SERIES 


Volume 9  No. 35  September 1958


G. M. Petersen: Matrix norms

G. B. Preston: Matrix representations of semigroups

T. G. Room and R. J. Smith: A en of the 
symplectic group . ; 

Prabhu: On the for the finite 


H. B. Shutrick : extensions . 

J. B. McLeod: On the commutator subring 


D. H. Parsons: One-dimensional characteristics of a par- 
tial differential equation of the second order, with ” 
number of independent variables 


A. G. Walker: Connexions for — distributions in 
the large (II) ; 

P. Whittle: A multivariate of Tchebichev’s 
inequality . 


OXFORD 
AT THE CLARENDON PRESS 


Price 16s. net 


PRINTED IN GREAT BRITAIN BY CHARLES BATEY AT THE UNIVERSITY PRESS, OXFORD 





Edited by T. W. CHAUNDY, U. S. HASLAM-JONES,
E. C. THOMPSON


THE QUARTERLY JOURNAL OF MATHEMATICS (OXFORD SECOND SERIES)
is published at 16s. net for a single number with an annual
subscription (for four numbers) of 55s. post free.

Papers, of a length normally not exceeding 20 printed pages
of the Journal, are invited on subjects of Pure and Applied
Mathematics, and should be addressed ‘The Editors, Quarterly
Journal of Mathematics, Clarendon Press, Oxford’. Authors
are referred to ‘The Printing of Mathematics’ (Oxford University
Press) for detailed advice on the preparation of mathematical
papers for publication. The Editors as a rule will not
wish to accept material that they cannot see their way to publish
within a twelvemonth.

While every care is taken of manuscripts submitted for publi- 
cation, the Publisher and the Editors cannot hold themselves 
responsible for any loss or damage. Authors are advised to 
retain a copy of anything they may send for publication. 
Authors of papers printed in the Quarterly Journal will be 
entitled to 50 free offprints. 

Correspondence on the subject-matter of the Quarterly
Journal should be addressed, as above, to ‘The Editors’, at
the Clarendon Press. All other correspondence should be
addressed to the publishers:

OXFORD UNIVERSITY PRESS 
AMEN HOUSE, LONDON, E.C.4 


The publishers are signatories to the Fair Copying Declaration in
respect of this journal. Details of the Declaration may be obtained
from the offices of the Royal Society upon application.







MATRIX NORMS 
By G. M. PETERSEN (Albuquerque, New Mexico) 
[Received 11 February 1957; in revised form 20 November 1957]


1. In this paper we investigate the regular summation matrices
$A = (a_{mn})$ that have the additional property

$$\lim_{m\to\infty} \max_n |a_{mn}| = 0.$$

I shall call the class of these matrices U. In this first section I
consider sequences of such matrices $\{A^r\}$ such that $A^r$ sums all bounded
sequences that are summable $A^{r-1}$ and show that there is a sequence
of regular iterations $\{B^r\}$, the matrix $B^r$ being equivalent to $A^r$ for
bounded sequences. In the second section, I prove some properties of
the norm of the matrix. Matrices of class U have been investigated
extensively by Lorentz (2).

A matrix $B = (b_{mn})$ is said to be ‘b-stronger than $A = (a_{mn})$’ if every
bounded sequence that is A-summable is also B-summable. If B is
b-stronger than A and A is b-stronger than B, then A and B are
‘b-equivalent’. The matrix B is a-stronger than A if all sequences
(bounded and unbounded) that are A-summable are also B-summable;
if A is also a-stronger than B, the two matrices are ‘a-equivalent’. Two
matrices are ‘b-consistent’ if every bounded sequence summed by them
both is summed to the same sum. The following theorem has been proved
by Brudno (1); see also (5).

THEOREM 1. If $B = (b_{mn})$ is regular and b-stronger than $A = (a_{mn})$,
then B must be b-consistent with A.

We first prove

Lemma 1. For every regular matrix $A = (a_{mn})$ and every strictly
increasing integral-valued function k(m), there is a matrix $A' = (a'_{mn})$ that
is b-equivalent to A and such that $a'_{m,k(m)} \ne 0$, $a'_{mn} = 0$ $(n > k(m))$.

Proof. C. Goffman and the author have shown (3) that, if $A = (a_{mn})$
is infinite-rowed, there is a finite-rowed matrix b-equivalent to A, and
hence we can assume that A is finite-rowed. We may also assume
that $a_{11} \ne 0$, $a_{1n} = 0$ $(n > 1)$. We can rearrange the rows so that
the number of the last non-zero element $\lambda(m)$ is a strictly increasing
function of m. We now form a new matrix $A'' = (a''_{mn})$ as follows: we


Quart. J. Math. Oxford (2), 9 (1958), 161-8. 


first repeat the first row of A (i.e. $a_{11}$) $\nu_1$ times, where $k(\nu_1) > \lambda(2)$; we
then repeat the second row $\nu_2$ times, where $k(\nu_1 + \nu_2) > \lambda(3)$, and then
introduce the third row of A and so on. In forming $A' = (a'_{mn})$, we
have = @”,, for all n if a), If = 9 Finn = for 
n k(m), kom) = It is evident that A” and A are b-equivalent, 
but so are A’ and A” since, if |s,| < H, 

H 


| > (Ginn | < 


This completes the proof of the lemma. 

The value of the counting function $\omega(n)$ of a sequence $\{n_\nu\}$ of integers
is, for a given n, the number of $n_\nu$ satisfying the inequality $n_\nu \le n$. Let
$\Omega(n)$ be a fixed positive function increasing to $+\infty$ with n. The function
$\Omega(n)$ is called a summability function of $A = (a_{mn})$ if

$$\lim_{m\to\infty} \sum_{n \in \{n_\nu\}} |a_{mn}| = 0$$

for any sequence $\{n_\nu\}$ for which $\omega(n) \le \Omega(n)$. For instance, each
function $\Omega(n) = o(n)$ is a summability function of the method (C, 1). Lorentz
(2) showed that


THEOREM 2. A regular method $A = (a_{mn})$ belongs to U if and only if
it has a summability function $\Omega(n)$.
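
The remark about (C, 1) can be checked numerically. The following is a minimal sketch, not part of the paper: it assumes the test sequence $n_\nu = \nu^2$ (an illustrative choice whose counting function is about $\sqrt{n}$, hence o(n)) and the usual Cesàro matrix $c_{mn} = 1/m$ for $n \le m$. The helper names `cesaro_mass` and `squares_up_to` are hypothetical.

```python
import math

def cesaro_mass(m, columns):
    """Sum of |c_mn| over the given columns n, for the (C, 1) matrix
    c_mn = 1/m (n <= m), c_mn = 0 (n > m)."""
    return sum(1 for n in columns if n <= m) / m

def squares_up_to(m):
    """The assumed test sequence n_v = v*v, restricted to n_v <= m."""
    return [v * v for v in range(1, math.isqrt(m) + 1)]

# The mass carried by the selected columns shrinks as the row index m grows.
for m in (10, 100, 10_000, 1_000_000):
    print(m, cesaro_mass(m, squares_up_to(m)))
```

The printed values decrease towards zero, which is the behaviour a summability function is meant to guarantee for every sequence it dominates.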
We have immediately the lemma:


Lemma 2. If $A = (a_{mn})$ is of the class U and $b_{mn} = a_{mn}$ $(n \ne n_k)$,
$b_{mn_k} = 0$ $(k = 1, 2, ...)$, then A and $B = (b_{mn})$ are b-equivalent, the counting
function of $\{n_k\}$ being dominated by $\Omega(n)$.


We now prove 


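Lemma 3. Let $\{A^r\}$ $(r = 1, 2, ...)$ be a sequence of matrices of type U,
$A^r = (a^r_{mn})$, such that $A^r$ is b-stronger than $A^{r-1}$. Let $\sum_n |a^r_{mn}| < M$ for
every m and r. There exists a sequence of matrices $\{B^r\}$ such that $B^r$ is
b-equivalent to $A^r$, and $B^r$ is a-stronger than $B^{r-1}$. Also

$$b^r_{m,\lambda_r(m)} \ne 0, \qquad b^r_{mn} = 0 \quad (n > \lambda_r(m))$$

and
either $\lambda_r(m+1) = \lambda_r(m) + 1$ or $\lambda_r(m+1) = \lambda_r(m) + (r+1)$;
in the latter case $b^r_{i,\lambda_r(m)+1} = \dots = b^r_{i,\lambda_r(m)+r} = 0$ $(i = 1, 2, ...)$.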


Proof. It is clear from Lemma 1 that we may assume $A^r = (a^r_{mn})$ to
be finite-rowed and, if $a^r_{m,k_r(m)} \ne 0$, $a^r_{mn} = 0$ $(n > k_r(m))$, then


k,(m) = 




where $\omega(\nu)$, the counting function of $\{\nu_m\}$, satisfies $\omega(\nu) \le \Omega(\nu)$, and $\Omega(\nu)$
is a summability function of $A^1$. A summability function of $A^1$ is a
summability function of $A^r$ $(r = 1, 2, ...)$. We shall further assume that


1. 
We construct a matrix $C^r = (c^r_{mn})$ such that
$c^r_{mn} = a^r_{mn}$ when $n \ne k_r(i)-(r+1), ..., k_r(i)-1$, for all i,
$c^r_{mn} = 0$ when $n = k_r(i)-(r+1), ..., k_r(i)-1$, for all i.
By Lemma 2 the matrix $C^r$ is b-equivalent to $A^r$. Let $C^r_m = \sum_n c^r_{mn} s_n$.
Then we construct 


as follows. Take m’ so that v,,..;—v,,, > 7; then, for m > m’, Bi, = Ch, 
if p = v,,—(m—m’')r; if v,,—(m—m')r < —(m+1—m’)r, then 


B= (1-5) % , wherei = p—v,,+-(m—m’)r; for p < m’, 
let BY, = 8,. 


It is clear that B’ is regular. Moreover, if C’ does not sum a sequence, 
{Br-on—mr} diverges, so that, if it is B’-summable, a sequence must 
also be Cr-summable. If a bounded sequence is Cr-summable to s, it is 
also summable C+! to s (i > 1), by Theorem 1. Suppose that |s',| < H 
for all n and > \ch,,| < M’ for all m and r. Then choose j such that 
HM'/j+1 <e; for0 <i <jthere is an m” such that max | C7+'—s| < 


0<i<j 
for m > m". Thus, if 


! l 
then |Bi—s| < = 
But, if 
Vn —(M—M' (m > mM"), 
then 


< < 2. 


Hence $\{s_n\}$ is $B^r$-summable, and $B^r$ and $C^r$ are b-equivalent. An
unbounded sequence cannot be $B^r$-summable if it is not $C^{r+i}$-summable
for every $i \ge 0$ to the same sum. In fact, there must be an $\{\epsilon_m\}$ such



that < ¢,, forall0 <i < v,,,.,—v,—rand lime,, = 0. 
m2 

For, if there were a sequence {i(m)} such that 


>, 
then clearly {B/,} would diverge, where = v,,—(m—m’')r+-i(m). How- 
ever, if {e,,} exists, for 
Vip —(m—m' pp < — (m+ 1—m’)r, 
1 
then |Bi—s| < (1— ten Pen, 
and { B7,} converges. On the other hand, if {s,,} is an unbounded sequence 
that is B’-summable, so that 


< Em 


for all 0 <i < »,,,,—v,,—r, then it must follow that 


when 0 < i—1 < 
This means that, for a suitable m”, 


—(m—m" < < 1) (m > m"), 


1 
and {s,} is Brt!-summable. Hence B+! is a-stronger than B" and 
b-equivalent to 
We also have k,.,(m) = k,(m)+-1 so that, if 9, 


=O (n>Am)), =Aw)+1 
whenever —(m—m')r < wp < —(M+-1—m’)r—1 
or when p < m’. Also 

= 1 +97] = 
and (m+ 1—m' = +7, 
so that 
(m+ 1—m' — (m+ 1—m’ = r+. 

However, 


Cy, = 0 when k,(m)—(r+1) <n < k,(m)—1, 
i.e. , =90 for all i. 


Hence, if A(u+1)—A(u) = r+1, then 
= —(m+1—m’)r—1, An) = 
and ( = 0 
for all and <n <v,,+(r—1), ie. for A(u)+1 <n <A(p)+r. 
This completes the proof of the lemma. 

The iteration of a matrix $B = (b_{mn})$ with a matrix $A = (a_{mn})$ is
defined by $t_m = \sum_n b_{mn} \tau_n$ where $\tau_n = \sum_k a_{nk} s_k$. In short, the iteration
of B with A or B.A consists in applying the matrix $(b_{mn})$ to the sequence
$\{\tau_n\}$ of A transforms of $\{s_n\}$.
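
A small sketch of the iteration, under stated assumptions: both matrices are taken to be the Cesàro matrix truncated to its first N rows and columns (an illustrative choice, exact here because the matrix is lower triangular), and the helper names `cesaro`, `transform` and `iteration` are hypothetical. It checks that applying B to the A transforms agrees with applying the product matrix once.

```python
from fractions import Fraction

def cesaro(n):
    """First n rows and columns of the (C, 1) matrix, as exact fractions."""
    return [[Fraction(1, m) if k <= m else Fraction(0)
             for k in range(1, n + 1)] for m in range(1, n + 1)]

def transform(matrix, s):
    """Apply a matrix to a sequence: t_m = sum_n matrix[m][n] * s[n]."""
    return [sum(a * x for a, x in zip(row, s)) for row in matrix]

def iteration(b, a):
    """Matrix of the iteration B.A: entry (m, k) = sum_n b[m][n] * a[n][k]."""
    size = len(a)
    return [[sum(b[m][n] * a[n][k] for n in range(size))
             for k in range(size)] for m in range(size)]

N = 6
C = cesaro(N)
s = [Fraction((-1) ** k) for k in range(N)]     # 1, -1, 1, -1, ...
tau = transform(C, s)                           # the A transforms
t_two_step = transform(C, tau)                  # B applied to {tau_n}
t_product = transform(iteration(C, C), s)       # product matrix applied to {s_n}
print(t_two_step == t_product)
```

For row-finite matrices the two routes always agree; for general infinite matrices the interchange of summations needs justification, which this finite sketch does not address.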

We now prove 

Lemma 4. There exists a sequence of regular matrices $\{D^r\}$ such that
$D^r . B^{r-1} = B^r$ for every r.

Proof. To find the matrix $D^r = (d^r_{mn})$ we must solve the equations


Aim) 
i=1 


We have b75' = 0 for all m and n =»,+1, »,+2,..., +(r—2),..., 
vp tl,..., y%+(r—2), where v, is the last member of the sequence {v,,} 
satisfying v, < A(m); however, b7,,, = 0 for these values of n also. Hence 
we need not consider these columns, and the coefficients of the remaining 
d’, form a non-zero triangular determinant since 
Ap-a(m+1) = A,4(m)+1 vg) 
or, if m+1 = »,+1, 
A, -s(m-+1) = 

This means that we can solve for the corresponding d7,,. The matrix 
is regular since { converges whenever { converges. This 
completes our proof. 

We collect the preceding lemmas in a theorem. 

THEOREM 3. Let $\{A^r\}$ $(r = 1, 2, ...)$ be a sequence of matrices of type
U, $A^r = (a^r_{mn})$, such that $A^r$ is b-stronger than $A^{r-1}$. Let $\sum_n |a^r_{mn}| < M$
for every m and r. There exists a sequence of matrices $\{B^r\}$, $\sum_n |b^r_{mn}| < M$,
such that $B^r$ is b-equivalent to $A^r$ and $B^r = D^r . B^{r-1}$, where $D^r = (d^r_{mn})$ is
a regular matrix.

2. The norm h(A) of a matrix has been defined by Brudno (1) as

$$h(A) = \sup_m \sum_n |a_{mn}|.$$

The method $\mathcal{A}$ has a norm $\|\mathcal{A}\|$ given by $\|\mathcal{A}\| = \inf h(A)$, where the ‘inf’
is taken over all the matrix methods equivalent to $\mathcal{A}$ for bounded



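sequences. The norm has the property that, if every $\mathcal{B}$-summable
bounded sequence is also $\mathcal{A}$-summable, i.e. $\mathcal{A}$ is b-stronger than $\mathcal{B}$,
then $\|\mathcal{A}\| \ge \|\mathcal{B}\|$. Brudno constructed a sequence of methods $\{\mathcal{A}_k\}$
such that $\mathcal{A}_{k+1}$ is b-stronger than $\mathcal{A}_k$ for every k and such that

$$\lim_{k\to\infty} \|\mathcal{A}_k\| = \infty.$$
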

The matrix $A_1$ is defined by the transformation $t_m = 2s_{2m} - s_{2m+1}$, and $A_k$
by k iterations of the method. For each k there is a sequence of 1’s
and ($-1$)’s that $A_k$ sums to $3^k$, so that any matrix $E = (e_{mn})$ that is
b-stronger than $A_k$ must have

$$\liminf_{m\to\infty} \sum_n |e_{mn}| \ge 3^k.$$

Since this would be true for all k, no regular method can be b-stronger
than this sequence of matrices [see also (6)]. To show that such a
sequence of 1’s and ($-1$)’s exists, we observe that in each column of
$A_1$ there is at most one non-zero element. Hence the same will be true
of each matrix $A_k$. If $\lim_m \sum_n |a^k_{mn}|$ exists, therefore a sequence of 1’s
and ($-1$)’s exists that is summed to $\lim_m \sum_n |a^k_{mn}|$. However, for all m,

$$\sum_n |a^2_{mn}| = 2(2+1) + 1(2+1) = (2+1)^2 = 3^2.$$

Likewise $\sum_n |a^k_{mn}| = (2+1)^k = 3^k$.
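
The row sums in Brudno's example can be reproduced with a short sketch. It assumes the transformation reads $t_m = 2s_{2m} - s_{2m+1}$ (the index convention is an assumption about the garbled source) and builds row m of the k-fold iteration recursively. The names `brudno_row` and `signs_for` are hypothetical.

```python
def brudno_row(m, k):
    """Row m of the k-fold iteration A_k as a dict {column: coefficient},
    assuming A_1 is the transformation t_m = 2*s_{2m} - s_{2m+1}."""
    if k == 0:
        return {m: 1}                      # zero iterations: the identity
    row = {}
    for n, coeff in ((2 * m, 2), (2 * m + 1, -1)):
        for col, value in brudno_row(n, k - 1).items():
            row[col] = row.get(col, 0) + coeff * value
    return row

def signs_for(m, k):
    """A choice of +1 / -1 on the columns met by row m of A_k that makes
    every term of the transform positive."""
    return {col: (1 if value > 0 else -1)
            for col, value in brudno_row(m, k).items()}

for k in (1, 2, 3):
    row = brudno_row(1, k)
    signs = signs_for(1, k)
    print(k,
          sum(row.values()),                        # row sum, 1 for a regular method
          sum(abs(v) for v in row.values()),        # sum of moduli
          sum(row[c] * signs[c] for c in row))      # transform of the +/-1 sequence
```

Because distinct rows of $A_k$ meet disjoint blocks of columns, the sign choices for different rows never conflict, so they assemble into a single sequence of 1's and ($-1$)'s.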
I now wish to modify the definition of the norm of a matrix to

$$h'(A) = \limsup_{m\to\infty} \sum_n |a_{mn}|.$$

Lemma 5. The norm of $\mathcal{A}$ is defined equivalently by $\|\mathcal{A}\| = \inf h(A)$
or $\|\mathcal{A}\| = \inf h'(A)$.
Proof. In the first place it is clear that for any matrix A = (a,,,,) we 
can select those rows {m,} such that 
= K, > U =h’‘(A), 
and define a new matrix B = (b,,,,) by 


However, A and B are b-equivalent: for let 


U 


Then t,—f,=0 (m£m,), < 


ie. lim |t,—bn| = 0. Hence infh(A) = infh’(A) and our statement is 
proved. 




THEOREM 4. There is a sequence of matrices $\{B_k\}$ such that $h'(B_k) = 1$
for every k, $B_{k+1}$ is a-stronger than $B_k$, and no regular matrix is a-stronger
than every $B_k$.


Proof. Consider the successive iterations of the matrices $A_k = (a^k_{mn})$
of Brudno’s example with the Cesàro matrix $C = (c_{mn})$, where

$$c_{mn} = m^{-1} \quad (n \le m), \qquad c_{mn} = 0 \quad (n > m).$$


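Denote the matrix $A_k . C$ by $B_k = (b^k_{mn})$; then $B_{k+1}$ is a-stronger than
$B_k$. We have

$$b^k_{mn} = \sum_{i=\lambda_m}^{\lambda_{m+1}} a^k_{mi} c_{in} \qquad (\lambda_m = 2^k m),$$

the terms of the mth row of $A_k$ being zero outside the limits of summation.
Now consider the sequence $\{s_i\}$,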


8 = <i < 24(m+1)), 
n remaining fixed. Since 
2km 


we have lims; = | for fixed k. But A, is regular, and this means that 


Am+1 
lim ak = 1. 
Hence b*,,, > 0 for large m. Since c;, = i-1 for all n, this holds uni- 
formly and, for some m, b*,,, > 0 for all n unless some c,,, in the sum 
are zero, i.e. for n < m—2*. In these cases we have 


2km 


Thus the sum of the last $2^k$ terms is $O(m^{-1})$, and the matrix is essentially
a positive matrix, so that $h'(B_k) = 1$ for all k. Suppose that a matrix
B is a-stronger than every $B_k$; since C is a triangular matrix, we can
find a matrix $D = (d_{mn})$ such that B = D.C, and D would be a-stronger
and hence b-stronger than $A_k$ for every k. However, this is
a contradiction, and so the theorem is proved.

For bounded sequences, if t,, = then > 0 for 
every fixed k. However, the A, methods do not sum bounded sequences 
with this property, and so we have that the matrices described in the 
theorem are all b-equivalent to C. The matrix C, therefore, is b-stronger 
than this set of matrices. 




I want to express my thanks to Dr. M. A. Tropper of Queen Mary
College, London, who kindly lent me her translation of Brudno’s
paper.


REFERENCES 


1. A. Brudno, ‘Summation of bounded sequences by matrices’ (in Russian),
Rec. Math. (Mat. Sbornik) N.S. 16 (1945) 191-247.

2. G. G. Lorentz, ‘A contribution to the theory of divergent series’, Acta Math.
80 (1948) 167-90.

3. C. Goffman and G. M. Petersen, ‘Submethods of regular matrix summability
methods’, Canadian J. of Math. 8 (1956) 40-46.

4. G. M. Petersen, ‘Sequences of iterations’, Math. Z. 68 (1957) 151-2.

5. G. M. Petersen, ‘Summability methods and bounded sequences’, J. London
Math. Soc. 31 (1956) 324-6.

6. G. M. Petersen, ‘The norm of iterations of regular matrices’, Proc. Cambridge
Phil. Soc. 53 (1957) 286-9.




MATRIX REPRESENTATIONS OF 
SEMIGROUPS 


By G. B. PRESTON (New Orleans) 
[Received 4 June 1957] 


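Let S be a semigroup with an identity element and let

$$\lambda = \{(a, b) : a, b \in S;\ Sa \subseteq Sb\},$$

$$\rho = \{(a, b) : a, b \in S;\ aS \subseteq bS\}.$$

Let $\mathcal{L} = \lambda \cap \lambda^{-1}$ and $\mathcal{R} = \rho \cap \rho^{-1}$. Then, in the notation of D. D.
Miller and A. H. Clifford (1), as shown by J. A. Green (2),

$$\mathcal{L} \circ \mathcal{R} = \mathcal{R} \circ \mathcal{L} = \mathcal{D}.$$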
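
These relations can be computed directly for a small example. The sketch below is an illustration only: it assumes S is the semigroup of all 27 mappings of {0, 1, 2} into itself (a finite stand-in chosen here, not the paper's example), with ab meaning a followed by b, and the names `mul`, `compose` and `classes` are hypothetical helpers.

```python
from itertools import product

POINTS = range(3)
S = list(product(POINTS, repeat=3))        # all maps of {0, 1, 2} into itself

def mul(a, b):
    """ab: the mapping a followed by the mapping b."""
    return tuple(b[a[i]] for i in POINTS)

LEFT = {a: frozenset(mul(s, a) for s in S) for a in S}     # the set Sa
RIGHT = {a: frozenset(mul(a, s) for s in S) for a in S}    # the set aS

# L = lambda meet its inverse (Sa = Sb); R = rho meet its inverse (aS = bS).
L_REL = {(a, b) for a in S for b in S if LEFT[a] == LEFT[b]}
R_REL = {(a, b) for a in S for b in S if RIGHT[a] == RIGHT[b]}

def compose(first, second):
    """Relational product: (a, c) whenever a first b and b second c for some b."""
    return {(a, c) for (a, b) in first for (b2, c) in second if b == b2}

def classes(rel):
    """The classes of an equivalence relation on S."""
    return {frozenset(y for (x, y) in rel if x == a) for a in S}

D_REL = compose(L_REL, R_REL)
print(D_REL == compose(R_REL, L_REL))
print(len(classes(L_REL)), len(classes(R_REL)), len(classes(D_REL)))
```

In this example the $\mathcal{L}$-classes correspond to image sets, the $\mathcal{R}$-classes to kernels, and the $\mathcal{D}$-classes to the size of the image.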


In a recent paper (3) M. P. Schützenberger introduced the concept of
a $\mathcal{D}$-class of finite type: the $\mathcal{D}$-class D is of finite type† if
ANA=pNLH=ANL 
for elements of D. Schützenberger then showed that, if S is any semigroup
with an identity containing a $\mathcal{D}$-class D, say, of finite type, then
S admits representations as a semigroup of matrices with entries from
a group with zero determined by D. If D is both of finite type and also
regular, the matrices in these representations corresponding to the
elements of D are closely related to the matrices constructed by Miller
and Clifford (1) to give a partial representation of a regular $\mathcal{D}$-class.
In certain important classes of semigroups every $\mathcal{D}$-class is of finite
type. The relation $\lambda$ determines a partial ordering of the $\mathcal{L}$-classes in
S: if L and L' are $\mathcal{L}$-classes, $L \le L'$ if $l \lambda l'$ for some $l \in L$, $l' \in L'$.
Similarly $\rho$ determines a partial ordering of the $\mathcal{R}$-classes which I also
denote by $\le$. Then it can be shown that the $\mathcal{D}$-class D is of finite type
if and only if each $\mathcal{L}$-class in D and also each $\mathcal{R}$-class in D is a minimal
element relative to these partial orderings in the set of all $\mathcal{L}$-classes and
$\mathcal{R}$-classes respectively in D. Hence it follows from J. A. Green’s Theorem 8 (2)
that every $\mathcal{D}$-class is of finite type in a semigroup satisfying Green’s
minimal conditions for the relations ‘$\le$’, namely that any set of
$\mathcal{L}$-classes or $\mathcal{R}$-classes respectively of S contains a minimal element.
On the other hand a regular $\mathcal{D}$-class is not necessarily of finite type.
For example let S be the set of all single-valued mappings of the set J
of all integers into itself with multiplication defined thus: for $a, b \in S$,
† ‘de type élémentaire’.


Quart. J. Math. Oxford (2), 9 (1958), 169-76. 




ab is the mapping a followed by the mapping b. Then it is easy to
verify that each $\mathcal{D}$-class consists of all elements of S which determine
mappings of J with image sets of a given cardinal. The semigroup S
is regular, so that each $\mathcal{D}$-class is also regular. Let D be the $\mathcal{D}$-class
consisting of all mappings with image sets which are not finite. Each
$\mathcal{L}$-class in D consists of all mappings with image sets which are a given
subset of J. Let L be the $\mathcal{L}$-class determined by the image set J; then,
if L' is any other $\mathcal{L}$-class in D, $L' \ne L$ and $L' \le L$, and consequently
D is not of finite type.

I show in this note that, by a slight modification of Schützenberger’s
representation theory, the restriction that D be of finite type may be
removed. Any semigroup S has representations by matrices with
entries from a group with zero determined by any one of its $\mathcal{D}$-classes.
In the final section we consider the direct sum of the representations
determined by the $\mathcal{D}$-classes of a semigroup S and show that, if S is
regular, then this direct sum gives a faithful representation of S.


1. The group determined by a $\mathcal{D}$-class

We assume throughout that S contains an identity element. This
assumption in fact imposes no restriction upon the validity of the
results [see (2)] although, without the assumption, the results will need
rephrasing.

Denote by $L_x$ the $\mathcal{L}$-class and by $R_x$ the $\mathcal{R}$-class containing x.
Then we have the following fundamental lemma due to Green [(2) 165]:


Lemma 1 (GREEN). Let $a \mathcal{R} b$ and let b = as (such an element $s \in S$
necessarily exists). Then there exists an element s' in S such that the right
translations

$$\rho_s : x \to xs \quad (x \in L_a),$$

$$\rho_{s'} : y \to ys' \quad (y \in L_b)$$

are mutually inverse, one-to-one, $\mathcal{R}$-class-preserving mappings of $L_a$ onto
$L_b$ and of $L_b$ onto $L_a$, respectively.
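
The lemma can be tested exhaustively on a small semigroup. This is a sketch under assumptions, not the paper's argument: S is taken to be all mappings of {0, 1, 2} into itself with ab meaning a followed by b, and for each $\mathcal{R}$-related pair the first available multipliers s and s' are used. The helper names are hypothetical.

```python
from itertools import product

POINTS = range(3)
S = list(product(POINTS, repeat=3))        # assumed example: all maps on 3 points

def mul(a, b):
    """ab: the mapping a followed by the mapping b."""
    return tuple(b[a[i]] for i in POINTS)

LEFT = {a: frozenset(mul(s, a) for s in S) for a in S}     # Sa
RIGHT = {a: frozenset(mul(a, s) for s in S) for a in S}    # aS

def L_class(a):
    return {x for x in S if LEFT[x] == LEFT[a]}

def R_class(a):
    return {x for x in S if RIGHT[x] == RIGHT[a]}

def check_green_lemma(a, b):
    """For a R-related to b, pick s, s' with as = b and bs' = a, then test
    that x -> xs maps L_a onto L_b, is undone by y -> ys', and keeps each
    element inside its own R-class."""
    s = next(u for u in S if mul(a, u) == b)
    s_back = next(u for u in S if mul(b, u) == a)
    forward = {x: mul(x, s) for x in L_class(a)}
    onto = set(forward.values()) == L_class(b)
    inverse = all(mul(y, s_back) == x for x, y in forward.items())
    keeps_r = all(RIGHT[x] == RIGHT[y] for x, y in forward.items())
    return onto and inverse and keeps_r

print(all(check_green_lemma(a, b) for a in S for b in R_class(a)))
```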


Using this lemma Schützenberger shows that each $\mathcal{D}$-class determines
a group. Denote the equivalence $\mathcal{L} \cap \mathcal{R}$ by $\mathcal{H}$.


Lemma 2 (SCHÜTZENBERGER). Let H be an $\mathcal{H}$-class of S. Let

$$T = \{t : t \in S;\ Ht = H\},$$

$$\tau = \{(t_1, t_2) : ht_1 = ht_2 \text{ for all } h \in H\}.$$

Then $\tau$ is a congruence over the semigroup T and $T/\tau$ is a group $\Gamma(H)$
of the same cardinal as H. If K is any $\mathcal{H}$-class in the same $\mathcal{D}$-class as
H, then $\Gamma(K)$ is isomorphic to $\Gamma(H)$. Let

$$T' = \{t : t \in S;\ tH = H\},$$

$$\tau' = \{(t_1, t_2) : t_1 h = t_2 h \text{ for all } h \in H\}.$$

Then $\tau'$ is a congruence over the semigroup T' and $T'/\tau'$ is a group $\Gamma'(H)$
isomorphic to $\Gamma(H)$.


$\Gamma(H)$ can be regarded as the group of all (1, 1)-mappings of H onto H
determined by right multiplications by elements of S. Schützenberger
shows, as it is easy to verify, that any such mapping is determined by
its effect on any element of H, so that, if we choose a fixed element
$h_0$ in H, then the mapping $h \to ht$ can be denoted unambiguously by
$\phi(h_0 t)$. It follows from Lemma 1, since any two elements of H are
$\mathcal{R}$-equivalent, that for any h in H there is an element $\phi(h)$ in $\Gamma(H)$ that
maps $h_0$ into h. The multiplication of the elements of $\Gamma(H)$ is determined
by the rule $\phi(h_0 t)\phi(h_0 s) = \phi(h_0 st)$. Effectively another multiplication
has been defined in H and under this multiplication H is a
group. When H is itself a group under the multiplication of S, then
we may choose $h_0$ as the identity of H, and then $\Gamma(H)$ may be identified
with H.

I complete this section by giving short proofs of two well-known
results on $\mathcal{H}$-classes. The first theorem is Green’s Theorem 7 in (2).


THEOREM (GREEN). Let H be an $\mathcal{H}$-class of a semigroup S such that
$h_1 h_2 \in H$ for some $h_1$, $h_2$ in H. Then H is a group.


Proof. Since $h_1$ is $\mathcal{R}$-equivalent to $h_1 h_2$ and both $h_1$ and $h_1 h_2$ belong
to the same $\mathcal{L}$-class, by Lemma 1, $Hh_2 = H$. Hence for any h in H,
$hh_2 \in H$, and so, by the left-right dual of Lemma 1, hH = H for any h
in H. Similarly Hh = H for any h in H; and so H is a group.

The next theorem is Theorem 3 of Miller and Clifford (1).


THEOREM (MILLER and CLIFFORD). Let a and b be elements of a semigroup
S. Then $ab \in R_a \cap L_b$ if and only if $R_b \cap L_a$ contains an idempotent
element; if this be the case, then

$$aH_b = H_a b = H_a H_b = H_{ab} = R_a \cap L_b.$$

Proof. Let $ab \in R_a \cap L_b$. By Lemma 1, since $a \mathcal{R} ab$, $\rho_b$ is an $\mathcal{R}$-class-preserving
(1, 1)-mapping of $L_a$ onto $L_b$ with an inverse $\rho_{b'}$ say. Hence
in particular bb'b = b, so that bb' is an idempotent and $bb' \in R_b \cap L_a$,
and further $H_a b = H_{ab} = R_a \cap L_b$.

The left-right dual of Lemma 1 similarly implies that $aH_b = H_{ab}$.



It now follows by an argument similar to that used to prove the previous
theorem that $H_a H_b = H_{ab}$.

Conversely, suppose that $e^2 = e \in R_b \cap L_a$. Then e is a left identity
for $R_b$ [(1) Lemma 4] and so $e \mathcal{R} eb = b$. Hence, by Lemma 1, $\rho_b$ is an
$\mathcal{R}$-class-preserving (1, 1)-mapping of $L_a$ onto $L_b$, so that in particular
$ab \in R_a \cap L_b$.
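
The first assertion of the theorem lends itself to a brute-force check. The sketch below assumes, purely for illustration, that S is the semigroup of all mappings of {0, 1, 2} into itself with ab meaning a followed by b. The function `miller_clifford_agrees` is a hypothetical name.

```python
from itertools import product

POINTS = range(3)
S = list(product(POINTS, repeat=3))        # assumed example: all maps on 3 points

def mul(a, b):
    """ab: the mapping a followed by the mapping b."""
    return tuple(b[a[i]] for i in POINTS)

LEFT = {a: frozenset(mul(s, a) for s in S) for a in S}     # Sa
RIGHT = {a: frozenset(mul(a, s) for s in S) for a in S}    # aS

def L_class(a):
    return {x for x in S if LEFT[x] == LEFT[a]}

def R_class(a):
    return {x for x in S if RIGHT[x] == RIGHT[a]}

def product_lands(a, b):
    """True when ab lies in the intersection of R_a and L_b."""
    ab = mul(a, b)
    return ab in R_class(a) and ab in L_class(b)

def has_idempotent(a, b):
    """True when the intersection of R_b and L_a contains an idempotent."""
    return any(mul(e, e) == e for e in R_class(b) & L_class(a))

def miller_clifford_agrees(a, b):
    return product_lands(a, b) == has_idempotent(a, b)

print(all(miller_clifford_agrees(a, b) for a in S for b in S))
```

Both outcomes occur in this example: the pair of constant maps (0, 0, 0) and (1, 1, 1) satisfies the condition, while a = b = (0, 0, 1) does not.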


2. The representation theorem 

Let D be any $\mathcal{D}$-class of S and let $L_\kappa$ $(\kappa \in K)$ and $R_i$ $(i \in I)$ denote
the $\mathcal{L}$-classes and $\mathcal{R}$-classes, respectively, of D. We may assume that
$K \cap I$ contains the symbol 1. Then it follows immediately from Lemma
1 and its left-right dual [(2) Theorem 1] that there exist $q_\kappa$, $q'_\kappa$ $(\kappa \in K)$
and $r_i$, $r'_i$ $(i \in I)$ belonging to S such that the mappings $x \to xq_\kappa$ and
$y \to yq'_\kappa$ are mutually inverse $\mathcal{R}$-class-preserving (1, 1)-mappings between
$L_1$ and $L_\kappa$, and the mappings $x \to r_i x$ and $y \to r'_i y$ are mutually
inverse $\mathcal{L}$-class-preserving (1, 1)-mappings between $R_1$ and $R_i$. Let
$H = R_1 \cap L_1$ and as in the previous section select a fixed but arbitrary
$h_0$ in H. Then with each $s \in S$ we associate a $K \times K$ matrix M(s) with
entries from the group with zero $G(H) = \Gamma(H) \cup \{0\}$, defined as follows:
$M(s) = \{m_{\kappa\nu}(s)\}$ where

$$m_{\kappa\nu}(s) = \begin{cases} \phi(h_0 q_\kappa s q'_\nu) & (h_0 q_\kappa s \in R_1 \cap L_\nu), \\ 0 & (\text{otherwise}). \end{cases}$$

Similarly we define the $I \times I$ matrix $M'(s) = \{m'_{ij}(s)\}$, where

$$m'_{ij}(s) = \begin{cases} \phi'(r'_i s r_j h_0) & (s r_j h_0 \in R_i \cap L_1), \\ 0 & (\text{otherwise}), \end{cases}$$

where $\phi'(h)$ denotes the element of $\Gamma'(H)$ determined by the mapping
of $h_0$ onto h. These definitions are only a slight modification of those
of Schützenberger, but the modification enables us to drop all restrictions
upon D.

THEOREM 1. The mappings $s \to M(s)$ and $s \to M'(s)$ are homomorphic
mappings of S onto semigroups of matrices, where the matrices are multiplied
by usual matrix multiplication and 0 is regarded as an additive zero
in the computation of matrix products.


Proof. We prove that $s \to M(s)$ is a homomorphic mapping. The
representation $s \to M'(s)$ is the left-right dual of the representation
$s \to M(s)$.

The element in the $(\kappa, \sigma)$-th place of the product of the matrices
M(s)M(t) is the formal sum

$$\sum_\nu m_{\kappa\nu}(s)\, m_{\nu\sigma}(t).$$




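This formal sum will make sense provided that at most one summand
is non-zero, and this is true because $m_{\kappa\nu}(s) \ne 0$ only if $h_0 q_\kappa s \in R_1 \cap L_\nu$
and, for a given $\kappa$, this holds for at most one $\nu$.

Suppose firstly that $m_{\kappa\nu}(s) = 0$ for all $\nu$, so that $h_0 q_\kappa s \notin R_1$. Then
$h_0 q_\kappa st \notin R_1$; for $h_0 q_\kappa st \in R_1$ implies that there exists u in S such that
$h_0 q_\kappa s(tu) = h_0$. Since also $h_0 q_\kappa s = h_0(q_\kappa s)$, we have $h_0 \mathcal{R} h_0 q_\kappa s$, and
$h_0 q_\kappa s \in R_1$, which is a contradiction. Hence $m_{\kappa\nu}(s) = 0$ for all $\nu$ implies
that $m_{\kappa\nu}(st) = 0$ for all $\nu$, and so in this case

$$\sum_\nu m_{\kappa\nu}(s)\, m_{\nu\sigma}(t) = m_{\kappa\sigma}(st).$$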


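Secondly, suppose that $m_{\kappa\nu}(s) \ne 0$, so that $h_0 q_\kappa s \in R_1 \cap L_\nu$. If
$m_{\nu\sigma}(t) \ne 0$, so that $h_0 q_\nu t \in R_1 \cap L_\sigma$, then, by Lemma 1, the mapping $\rho_t$
is an $\mathcal{R}$-class-preserving (1, 1)-mapping of $L_\nu$ upon $L_\sigma$, and so, since
$h_0 q_\kappa s \in R_1 \cap L_\nu$, therefore $h_0 q_\kappa st \in R_1 \cap L_\sigma$. Hence

$$m_{\kappa\nu}(s)\, m_{\nu\sigma}(t) = \phi(h_0 q_\kappa s q'_\nu q_\nu t q'_\sigma) = \phi(h_0 q_\kappa st q'_\sigma) = m_{\kappa\sigma}(st)$$

since right multiplication by $q'_\nu q_\nu$ is the identity-mapping on $L_\nu$ to
which $h_0 q_\kappa s$ belongs. A similar argument now shows that, if $m_{\nu\sigma}(t) = 0$,
then $m_{\kappa\sigma}(st) = 0$, for $m_{\kappa\sigma}(st) \ne 0$ implies that the mapping $\rho_t$ is an
$\mathcal{R}$-class-preserving (1, 1)-mapping of $L_\nu$ upon $L_\sigma$. Hence in all cases
when $m_{\kappa\nu}(s) \ne 0$, for some $\nu$, then

$$m_{\kappa\nu}(s)\, m_{\nu\sigma}(t) = m_{\kappa\sigma}(st).$$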


This completes the proof of the theorem. 

It is easily seen, as for the representation defined by Schützenberger,
that a replacement of $h_0$ by some other element of H leaves the above
representation unchanged. A replacement of the $q_\kappa$, $q'_\kappa$ by $t_\kappa$, $t'_\kappa$ say,
transforms the matrix M(s) to the matrix $AM(s)A^{-1}$, where

$$A = \operatorname{diag}(a_\kappa), \qquad A^{-1} = \operatorname{diag}(a_\kappa^{-1})$$

and where $a_\kappa = \phi(h_0 t_\kappa q'_\kappa)$. A similar transformation of the M'(s)
results on a new choice of $r_i$, $r'_i$.

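Further suppose that we replace H by $H_{i\kappa} = R_i \cap L_\kappa$. Choose as a
fixed element in $H_{i\kappa}$ the element $(h_{i\kappa})_0 = r_i h_0 q_\kappa$. Then, if the elements
of $\Gamma(H_{i\kappa})$ are denoted by $\phi_{i\kappa}(h_{i\kappa})$, the mapping

$$\theta : \phi(h) \to \phi_{i\kappa}(r_i h q_\kappa)$$

is an isomorphism between $\Gamma(H)$ and $\Gamma(H_{i\kappa})$.

Choose as right multipliers which effect one-to-one $\mathcal{R}$-class-preserving
mappings between $L_\kappa$ and $L_\nu$ the elements $p_\nu = q'_\kappa q_\nu$ and $p'_\nu = q'_\nu q_\kappa$.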




With this choice let $M_{i\kappa}(s)$ be the matrix representation of S determined
by $H_{i\kappa}$, the $p_\nu$, $p'_\nu$ and the group $\Gamma(H_{i\kappa})$. Then it can be verified that
$M_{i\kappa}(s) = \{M(s)\}\theta$, where by $\{M(s)\}\theta$ we mean the matrix $\{m_{\kappa\nu}(s)\theta\}$.

These comments, and the analogous comments about M'(s), show
that the representations M(s) and M'(s) of S are determined to within
isomorphism by the $\mathcal{D}$-class D to which H belongs.


3. A faithful representation of a regular semigroup 

The direct sum of a set of matrix representations of S is again a
representation of S, and the question arises: when is the direct sum of
all the $\mathcal{D}$-class matrix representations of a semigroup S a faithful
representation of S? Corresponding to each $\mathcal{D}$-class in S choose representations
M(s) and M'(s) of S as in the previous section. Denote by $\mathcal{P}(s)$
the direct sum of all the representations M(s), one corresponding to
each $\mathcal{D}$-class in S; similarly denote by $\mathcal{P}'(s)$ the direct sum of all the
M'(s); and denote by $[\mathcal{P}+\mathcal{P}'](s)$ the direct sum of $\mathcal{P}(s)$ and $\mathcal{P}'(s)$.
I give in the following lemma necessary and sufficient conditions for
each of the representations $\mathcal{P}(s)$, $\mathcal{P}'(s)$, and $[\mathcal{P}+\mathcal{P}'](s)$ to be faithful.


Lemma 3. Let

$$Q = \{(s, t) : s, t \in S;\ xs \text{ or } xt \in R_x \text{ implies } xs = xt\}.$$

Let

$$Q' = \{(s, t) : s, t \in S;\ sx \text{ or } tx \in L_x \text{ implies } sx = tx\}.$$

Then Q and Q' are congruences on S and $\mathcal{P}(s)$, $\mathcal{P}'(s)$, $[\mathcal{P}+\mathcal{P}'](s)$ are
faithful representations of S if and only if, respectively, Q, Q', $Q \cap Q'$ are
equal to $\Delta_S$, the identical congruence on S.

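Proof. Suppose that, for some s, t in S, $\mathcal{P}(s) = \mathcal{P}(t)$, so that
M(s) = M(t) for each $\mathcal{D}$-class in S. Let $x \in S$ and suppose that x
belongs to the $\mathcal{D}$-class D. Let D be decomposed into its $\mathcal{L}$, $\mathcal{R}$, and $\mathcal{H}$
classes as in § 2. Suppose that $x \in H_{i\kappa} = R_i \cap L_\kappa$. Since M(s) = M(t),
therefore, in the notation of the previous section, $M(s)\theta = M(t)\theta$.
Hence, if $xs \in R_x$, so that $xs \in R_i \cap L_\mu$ for some $\mu$, then

$$(h_{i\kappa})_0 p_\kappa s \in R_i \cap L_\mu,$$

and $M(s)\theta = M(t)\theta$ implies that

$$(h_{i\kappa})_0 p_\kappa t \in R_i \cap L_\mu$$

and that $\phi_{i\kappa}((h_{i\kappa})_0 p_\kappa s p'_\mu) = \phi_{i\kappa}((h_{i\kappa})_0 p_\kappa t p'_\mu)$.
Hence, if $xs \in R_x$, then $p_\kappa s p'_\mu$ and $p_\kappa t p'_\mu$ determine the same right
multiplication of $H_{i\kappa}$, and so $x p_\kappa s p'_\mu = x p_\kappa t p'_\mu$. This implies that
$x p_\kappa s p'_\mu p_\mu = x p_\kappa t p'_\mu p_\mu$: that is $x p_\kappa s = x p_\kappa t$. But, by the choice of $p_\kappa$,
$p_\kappa$ is a right identity for $H_{i\kappa}$. Hence, finally, we have that xs = xt.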




Similarly, if xt ∈ R_x, then xt = xs. This argument holds for any x in S;
and so (s,t) ∈ Q.

Conversely, an examination of the above argument shows that it
can be reversed and that (s,t) ∈ Q implies that Z(s) = Z(t).

Hence, since {Z(s)} is a homomorphic image of S, Q is a congruence
on S, and S/Q is isomorphic to {Z(s)}.

Similarly we prove that S/Q' is isomorphic to {Z'(s)} and that
S/(Q ∩ Q') is isomorphic to {[Z+Z'](s)}. The remaining assertions in
the lemma then follow immediately.

As a corollary we have the following theorem: 


THEOREM 2. Let S be a regular semigroup. Then the representation
[Z+Z'](s) is a faithful representation of S.


Proof. Let (s,t) ∈ Q ∩ Q'. We have to show that s = t.

Since S is regular, s and t have inverses [see (1) or (2) for the definition
of a regular semigroup]. Thus there exist elements x and y belonging
to S such that

sxs = s, xsx = x, tyt = t, yty = y.
Then, since xsx = x, we have that xs ∈ R_x, and so, since (s,t) ∈ Q,
xs = xt. Again yty = y implies yt = ys. Similarly, since (s,t) ∈ Q',
therefore tx = sx and ty = sy. Hence
s = s(xs) = s(xt) = sx(tyt) = (sxt)yt = syt = (sy)t = tyt = t.
Thus Q ∩ Q' = Δ_S and the theorem follows from Lemma 3.

Neither the representation Z(s) nor the representation Z'(s) is in
general a faithful representation of a regular semigroup, as is shown
by considering the rectangular band (4) B = {e,f,g,h} of four
idempotents. B is regular but neither Z(s) nor Z'(s) is a faithful
representation of B.
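
The conditions of Lemma 3 can be checked mechanically on a small example. The following is a minimal sketch in Python, not part of the original paper: it models the four-element rectangular band as pairs (i, a) with product (i, a)(j, b) = (i, b), takes the R-class (L-class) of x to be the set of elements generating the same principal right (left) ideal, and computes the two relations of Lemma 3. The function names are hypothetical, and the relations are read with x ranging over all of S.

```python
from itertools import product

def ideal_key(S, mul, x, side):
    """Principal right (side='R') or left (side='L') ideal of x, identity adjoined."""
    if side == "R":
        return frozenset({x} | {mul(x, a) for a in S})
    return frozenset({x} | {mul(a, x) for a in S})

def lemma3_relation(S, mul, side):
    """Relation Q (side='R') or Q' (side='L') of Lemma 3 as a set of ordered pairs."""
    key = {x: ideal_key(S, mul, x, side) for x in S}
    # For Q we multiply on the left by x (xs, xt); for Q' on the right (sx, tx).
    act = (lambda x, s: mul(x, s)) if side == "R" else (lambda x, s: mul(s, x))
    rel = set()
    for s, t in product(S, repeat=2):
        ok = True
        for x in S:
            xs, xt = act(x, s), act(x, t)
            # "xs or xt in the class of x implies xs = xt"
            if (key[xs] == key[x] or key[xt] == key[x]) and xs != xt:
                ok = False
                break
        if ok:
            rel.add((s, t))
    return rel

# Rectangular band on {0,1} x {0,1}: (i, a)(j, b) = (i, b).
band = list(product(range(2), repeat=2))
band_mul = lambda u, v: (u[0], v[1])

Q = lemma3_relation(band, band_mul, "R")
Q_dash = lemma3_relation(band, band_mul, "L")
diagonal = {(s, s) for s in band}
```

In this model Q identifies elements with the same second coordinate and Q' those with the same first coordinate, so neither is the identical congruence, while their intersection is; this agrees with Theorem 2 and with the remark on the rectangular band.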

Professor A. H. Clifford pointed out to me that, when S is an inverse
semigroup, then the representations Z(s) and Z'(s) are each faithful
representations of S. To see this consider again a regular semigroup S,
let (s,t) ∈ Q, and let x and y be any inverses of s and t respectively.
Then, as in the proof of Theorem 2, xs = xt and yt = ys. Hence


s = s(xs) = s(xt) = sx(tyt) = s(xt)(yt) = s(xs)(ys) = sys,
ysy = y(ty) = y.
Thus y is an inverse of s. Similarly x is an inverse of t. Thus s and t
are elements of S with the same set of elements of S as inverses. It
follows therefore that, if S is an inverse semigroup, i.e. a regular
semigroup in which each element has a unique inverse (5), then (s,t) ∈ Q
implies that s = t. Similarly it follows that Q' = Δ_S when S is an
inverse semigroup.

Finally, I give an example of a semigroup which is not regular but
which is faithfully represented by the representation Z(s). Let T be
the right simple semigroup (that is, a semigroup with no proper right
ideals) consisting of all (1,1)-mappings α of the set I of all integers into
I such that I−α(I) is an infinite set. Here α(I) denotes the image of I
under α. This is the semigroup introduced by R. Baer and F. Levi (6).
The product αβ of two elements α, β in T is taken to be the mapping α
followed by the mapping β. Adjoin to T the identity element 1 to form
the semigroup S. Then I shall show that over the semigroup S the
congruence Q coincides with Δ_S.

For let (s,t) ∈ Q, so that for any x ∈ S, xs or xt ∈ R_x implies xs = xt.
If s = 1, then in particular xs ∈ R_x if x = 1, and in this case

s = xs = xt = t,

so that s = t. Similarly, if t = 1, then s = 1. Suppose now that s and t
both belong to T. Since T is right simple, T is an R-class of S, and
so xs ∈ R_x for all x in T. Hence xs = xt for all x in T. Suppose that
s ≠ t. Then there is an element n of I such that s(n) ≠ t(n). Let x be
any element of T for which n = x(n); since T was chosen as the set of
all (1,1)-mappings α of I into I such that I−α(I) is infinite, there
certainly exist mappings x in T with this property. Then, for such an x,
xs ≠ xt since xs(n) ≠ xt(n). This is a contradiction, and so s = t.
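
The choice of x in this argument can be made explicit. The sketch below (Python, an illustration only) represents mappings of the integers as functions; the particular maps s, t and the helper `fixing` are hypothetical choices, each one-to-one with an image that omits infinitely many integers, and the product follows the convention above that ab means a followed by b.

```python
def compose(a, b):
    """Product ab: the mapping a followed by the mapping b."""
    return lambda n: b(a(n))

def fixing(n):
    """A one-to-one map x of the integers with x(n) = n; its image
    {n + 2j} omits infinitely many integers."""
    return lambda j: n + 2 * (j - n)

s = lambda j: 2 * j        # image: the even integers
t = lambda j: 2 * j + 1    # image: the odd integers

n = 5                      # a point at which s and t differ
x = fixing(n)
xs, xt = compose(x, s), compose(x, t)
window = range(-50, 51)    # finite window used only for spot checks
```

Here xs(n) = s(n) and xt(n) = t(n), so xs and xt differ at n, which is the contradiction used in the text. On the same window one can also observe that s followed by s differs from s, in line with the remark that T contains no idempotents.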

This completes the proof that the representation Z(s) is a faithful
representation of S. It is clear that S is not regular, for the D-class T
contains no idempotents.


Acknowledgement. I must thank Dr. M. P. Schützenberger for making
available to me a copy of his paper (3) before publication.


REFERENCES
1. D. D. Miller and A. H. Clifford, ‘Regular D-classes in semigroups’, Trans.
   American Math. Soc. 82 (1956) 270-80.
2. J. A. Green, ‘On the structure of semigroups’, Ann. Math. 54 (1951) 163-72.
3. M. P. Schützenberger, ‘D̄ représentation des demi-groupes’, C. R. Acad. Sci.
   244 (1957) 1994-6.
4. A. H. Clifford, ‘Bands of semigroups’, Proc. American Math. Soc. 5 (1954)
   499-504.
5. W. D. Munn and R. Penrose, ‘A note on inverse semigroups’, Proc. Cambridge
   Phil. Soc. 51 (1955) 396-99.
6. R. Baer and F. Levi, ‘Vollständige irreducibele Systeme von
   Gruppenaxiomen’, S. B. Heidelberg. Akad. Wiss. 18 (1932) 7.




A GENERATION OF THE SYMPLECTIC GROUP 
By T. G. ROOM (Sydney) and R. J. SMITH (Kingston, Ontario) 
[Received 14 June 1957] 


In this paper the symplectic group L_{2m}(G) is defined as the group of
matrices A of 2m rows and columns for which


A^T G A = G,

where 


The elements of the matrices may belong to any field of characteristic
not equal to 2. If the elements belong to GF(p), then the group is
isomorphic with Dickson’s SA(2m, p) [Dickson (1) 91, Frucht (2)].
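
The defining condition A^T G A = G is easy to test numerically. The sketch below (Python, exact rational arithmetic) is an illustration under an explicit assumption: since the paper's matrix G is not reproduced here, the canonical skew matrix with blocks [[0, I], [−I, 0]] is used as a stand-in, and the one-parameter family `shear` is a hypothetical example of a matrix depending linearly on a scalar x; it is not claimed to be the paper's D^x.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def canonical_skew(m):
    """Block matrix [[0, I], [-I, 0]] of order 2m (assumed stand-in for G)."""
    n = 2 * m
    J = [[0] * n for _ in range(n)]
    for i in range(m):
        J[i][m + i] = 1
        J[m + i][i] = -1
    return J

def is_symplectic(A, J):
    """True when A^T J A = J."""
    return matmul(matmul(transpose(A), J), A) == J

def shear(m, x):
    """Identity plus x in position (1, m+1): a matrix linear in the scalar x."""
    n = 2 * m
    A = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    A[0][m] += x
    return A

J = canonical_skew(2)
A = shear(2, Fraction(3, 7))
```

With these definitions `is_symplectic(A, J)` holds, products of such matrices again satisfy the condition, and `shear(2, x)` multiplied by `shear(2, y)` equals `shear(2, x + y)`, so that for integer x the matrix is the xth power of `shear(2, 1)`.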

We were led to the investigation of the generators of this group in 
the course of work on the geometrical loci invariant under certain 
groups of collineations associated with the generalized Clifford units. 
The particular form of the matrix G arises naturally in this work, and
the matrices which are introduced as generators have their origin in 
certain substitution operations among sets of Clifford units. 


is the canonical skew matrix, then 
S = 
where 
0 1 —1 0 —1 0 —1 O 
0 0 1 1 1 
0 0 0 1 —1 0 —1 


so that generators of L_{2m}(S) can be derived directly from those of
L_{2m}(G).
We use ε_r = {0, 0,..., 1,..., 0} (r = 1,..., 2m),


Quart. J. Math. Oxford (2), 9 (1958), 177-82. 
with 1 in the rth place, for the basis for vectors of 2m components,
and the following symbols for certain vectors of k components
o_k = {0, 0,..., 0},
v_k = {1, 1,..., 1},
= 
The paper is devoted to the proof of the following theorem: 


THEOREM I. Every element of L_{2m}(G) can be expressed as a finite
product in which each term is either Q or a matrix-linear function D^x for
some value of the scalar variable x, where

It is easily verified that Q and D^x belong to L_{2m}(G). We have
Q^{2m+1} =
and, if D is written for D^1, then, for every integer x, D^x is the xth
power of D.

When the elements of the matrices belong to GF(p), so that L_{2m}(G)
is isomorphic with SA(2m,p), then the matrix-linear function D^x is
always a power of D, and

D^p = I.
Theorem I then takes the form: 

THEOREM II. SA(2m,p) is isomorphic with the group generated by the 

two matrices Q and D, where 


From Q and D* we derive the matrices 
WVam-r 
0 Lom 
Pz = 
= (r = 1,..., 2m—1), 


Pin = 


1 


Again, writing P_r for P_r^1, we have, for any integer x, P_r^x is the xth
power of P_r.


For any vector a, 
= 
= (7 = 1,..., 2m—1). 
In relation to any given vector a for which 


where 


we define P* as 
= P? for the value z = 
We have also, when a, = 0, 
Almost the whole of the reduction of a matrix A to a product of terms
Q and D^x is performed with the use of the matrices P_r^x and P_r. We
proceed first to find a matrix R_1 such that

R_1 a = ε_1,
a being supposed to be the first column of the matrix A.
(ia) Suppose that 
#0, for all s = 1,...,r—1, 
t=0 


while 


Because the sum of the last two non-zero components is zero, we cannot 
proceed further with this type of reduction. 


(ib) Suppose that 
Xom—pt%m—s+1 = 9 for all s = 1,...,r—1, 


while 
so that = (044, — —p 43 
Then 


and we can proceed with the reduction of 


180 T. G. ROOM AND R. J. SMITH 


using only matrices from the set Pf,..., P3,—--;, none of which affects 
the remaining r components of the vector. 

(ii) The result of a sequence of operations of the forms (ia) and (ib) 
is to reduce a to the form 

Use operations P, to move all zeros to the right. Then further applica- 
tions of the operations P* in relation to pairs of successive terms whose 
sum is not zero, and of operations P, to move zeros to the right, will 
result in the reduction of a to 
Y = (YO), 

where y ≠ 0, since A is not singular.

The reduction from this form presents some unexpected complica- 
tions. In view of its application at a later stage we effect first the 


reduction in the case in which y = —1 and n is odd. 
(iii) Let N= +1» 
Then 


Pay = 2, —1, 
P} P} Pi eee Pi, Poy = (2, —1, 
We can now move all zeros to the right by operations P, and so obtain 
= (2, —1, 
Finally = &. 
It is to be noted that this operation cannot be carried out if the field is 


of characteristic 2, and for that reason the theorem cannot be proved 
(at least in this form) for such fields. 


(iv) Return now to the general case 
Y = (Y,,, 
in which y is unrestricted, and suppose first that n is even. Then 

If y = —1, this vector can be reduced as in (iii). 

(v) Suppose then that a has been reduced to 

Y = 

where y+1 ≠ 0.
PY = (725-2, —y—1, 1, 


= PT PE... = (—y—1, O2,-2, 1, 


Finally *(y") = 1, 
and this can be reduced by operations P., to €,. 




Thus, whatever the first column of A may be, we can find a matrix
R_1, a product of matrices P_r^x and D^x and therefore of matrices Q and D^x,
such that A_1 = R_1 A is a matrix with first column ε_1.

(vi) Suppose that matrices R_1, R_2,..., R_{k−1} (2 ≤ k ≤ 2m−1) have
been found such that

We wish to find a matrix R_k, a product of matrices Q and D^x, such
that the first k columns of R_k A_{k−1} are ε_1,..., ε_{k−1}, ε_k. We find in fact
R_k as a product of terms P_k^x,..., P_{2m}^x (for various values of x); each of
these clearly leaves ε_1,..., ε_{k−1} unchanged.
we must have 


e? Gx = ef Gx = ... = €f_, Gu = 1, 
and therefore 


= 
i.e. Ky = Kg tks = = Kpigtky_, = 9, 
say = 1,...,k—1). 


The first step in the reduction is to obtain zeros for the first k—1 
terms: we have 

We may now reduce x’ by steps corresponding to (i) and (ii) to the 
x” = (1, —K-—p+1)- 
These steps involve the use only of operations from the set Pj, Pj,,,..., 
P},,-, for various values of x. 

Applying the condition that x'' is the kth column of a matrix
belonging to L_{2m}(G) whose first k−1 columns are ε_1,..., ε_{k−1}, we find that

p is odd, 

and we may therefore use operations of type (iii), again involving only 
P%,..., Pom, to reduce x” to €,. 

That is, we have found the required matrix R,. 


(vii) Since we have already obtained a matrix R_1 such that the first
column of R_1 A is ε_1, we can obtain successively matrices R_2,..., R_{2m−1}
which reduce the next 2m−2 columns to the required forms, i.e.
matrices such that


Aon-1 = Ren-1 eee R,A => #2). 


Since = G, 

we have = 1,...,2m—1) 
and Men = 1, 

so that = 1). 
Finally = 

and therefore = I, 

ie. A = (PS, Ren-1 


where each term is the product of a finite number of terms P_r^x and D^x,
i.e. of powers of Q and values of the linear matrix function D^x.


One of us (T. G. R.) wishes to express his thanks to the Tata Institute 
in Bombay, and the Institut Henri Poincaré in Paris for the facilities 
so generously made available to him while the paper was being written. 


REFERENCES

1. L. E. Dickson, Linear Groups (Berlin, 1901).
2. R. Frucht, J. reine angew. Math. 166 (1932) 16.



ON THE INTEGRAL EQUATION FOR THE 
FINITE DAM 


By N. U. PRABHU (Karnatak University, India) 
[Received 4 July 1957] 


1. Introduction 
In considering a model for the dam of finite capacity k, Moran (2)
makes the following assumptions: (i) the inputs X_t (t = 0, 1, 2,...) which
flow into the dam in the yearly intervals (t, t+1) are independently and
identically distributed; (ii) if Z_t (< k) is the storage at time t before the
input X_t flows into the dam, then, for Z_t+X_t > k, an amount Z_t+X_t−k
will overflow, but, for Z_t+X_t < k, there will be no overflow; the dam
will then contain a quantity k or Z_t+X_t, whichever is the less; (iii) at
time t+1, the amount of water released is m if Z_t+X_t > m or Z_t+X_t
if Z_t+X_t < m, where m < k. It is then clear that {Z_t+X_t} and {Z_t} are
both Markov chains, and, for a given probability distribution of the
input X_t, their stationary distributions may be studied. In the case
where the probability distribution of X_t is of the continuous type with
the cumulative distribution function (c.d.f.) G(x), so that

Pr{x < X_t < x+dx} = dG(x),    (1)

the c.d.f. H(y) of the stationary distribution of the dam content Z_t+X_t
satisfies the integral equation

H(y) = −∫_m^{m+y} H(t) dG(m+y−t)    (y < k−m),
H(y) = G(y−k+m) − ∫_m^{k} H(t) dG(m+y−t)    (y > k−m),    (2)

together with H(∞) = 1 [cf. Moran (2), where the equation is written
in terms of the frequency function of the stationary distribution].
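
The three assumptions translate directly into a simulation of the storage process. The following Python sketch is an illustration only, under stated assumptions: the function name, the parameter values and the use of a gamma-distributed input with shape p and rate μ are choices made here, and the simulation estimates nothing beyond what the assumptions (i) to (iii) imply.

```python
import random

def simulate_dam(k, m, p, mu, steps, seed=0):
    """Simulate the dam with capacity k and release m.

    Returns the successive contents Z_t + X_t (before overflow) and the
    storages Z_{t+1} left after overflow and release.
    """
    rng = random.Random(seed)
    z = 0.0
    contents, storages = [], []
    for _ in range(steps):
        w = z + rng.gammavariate(p, 1.0 / mu)  # content Z_t + X_t; scale = 1/mu
        held = min(w, k)                       # anything above k overflows
        z = held - min(m, held)                # release m, or everything if less
        contents.append(w)
        storages.append(z)
    return contents, storages

contents, storages = simulate_dam(k=5.0, m=1.0, p=2, mu=1.5, steps=2000, seed=1)
```

In such a run the storage always lies between 0 and k−m, and it is zero exactly when the preceding content did not exceed m; the empirical distribution of `contents` may be compared with a solution H(y) of (2).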
It might be useful to solve the integral equation (2) for the important
class of input distributions

dG(x) = μ^p e^{−μx} x^{p−1} dx / (p−1)!    (0 < x < ∞),    (3)

where μ > 0 and p = 1, 2,.... Moran (3) obtained an exact solution
(although by a different technique) for the case of the negative
exponential input [corresponding to p = 1 in (3)], and also in (4) for the
general gamma-type input when k → ∞ and the release is continuous.
Gani and Prabhu (1) studied the case where H(y) is of an approximate
Quart. J. Math. Oxford (2), 9 (1958), 183-8.




gamma-type, and found that the associated G(x) is also of approximate
gamma-type, for k → ∞ but m discrete. In this paper I obtain an exact
solution for the integral equation (2) when the input distribution is of
the general gamma-type (3).


2. Solution of the integral equation in the case of a gamma- 
type input 
Let us consider first the case of a negative exponential input,

dG(x) = μe^{−μx} dx    (x > 0).    (4)

Moran’s solution (3) for the frequency function h(y) of the stationary 

distribution of the dam content is given by the equation 

—(y—qm)** exp{u(y—gqm)}], 
where = e*ifx > Oor < 0,k = (N+1)m+U,0< U <m, 
and c is the normalizing constant. This can be written as 


<y<™m), 


q=0 


h(k—y) = 


(nm < y < (n+1)m; n = 1,2,...,N+1), 

where λ = μe^{−μm}. From this we obtain
y cy <y< Mm), 


(y—qm)? 
q=0 
This suggests the substitution (nm < y < (n+1)m). 


in the integral equation (2) when the input is of the general gamma-
type (3); the equation then reduces to


p-l , 


r=0 


k—m 
pre-wm Ot) ——— dt (-w< y<m), 
(p—1)! 


k-m 
(t—y+m)” 




which is a mixture of both Fredholm and Volterra types of integral
equation, with the kernel (t−y+m)^{p−1}/(p−1)!, but owing to the
presence of m in the lower limit of the integral on the right-hand side
(for m < y < k) the known methods for solving such equations are not
directly applicable. However, we note that the kernel is resolvable: in
fact, we have
_ —r— 


Let us put 


r=0 —1)! 


where 


(r = 0,1,...,p—1). (8) 


Pp 
a, = 


8=0 
Then the equation (7) can be written as

Φ(y) = Σ_{r=0}^{p−1} α_r y^r/r!    (−∞ < y < m),    (9)

Φ(y) = Σ_{r=0}^{p−1} α_r y^r/r! − λ ∫_0^{y−m} Φ(t) (y−m−t)^{p−1}/(p−1)! dt
    (m < y < k),    (10)


where λ = (−1)^{p−1} μ^p e^{−μm}. It is seen that the integral on the right-hand
side of (10) involves Φ(t) in the range (0, y−m); this enables us to solve
for Φ(y) successively for the ranges (m, 2m), (2m, 3m),... in terms of the
unknown constants α_0, α_1,..., α_{p−1}. For instance, let m < y < 2m; then


a, dt 


This suggests the general expression

Φ(y) = Σ_{r=0}^{p−1} α_r Σ_{q=0}^{n} (−λ)^q (y−qm)^{qp+r}/(qp+r)!
    (nm < y < (n+1)m; n = 1, 2,..., N+1).    (11)




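To prove that this, in fact, is the solution to (9) we use the method of
induction. Assume that Φ(y) is given by the above expression in the
range 0 < y < (n+1)m. Then for (n+1)m < y < (n+2)m we have,
from (9),

Φ(y) = Σ_{r=0}^{p−1} α_r y^r/r!
       − λ Σ_{r=0}^{p−1} α_r Σ_{q=0}^{n} (−λ)^q ∫_{qm}^{y−m} (t−qm)^{qp+r} (y−m−t)^{p−1} / ((qp+r)!(p−1)!) dt

     = Σ_{r=0}^{p−1} α_r y^r/r! + Σ_{r=0}^{p−1} α_r Σ_{q=0}^{n} (−λ)^{q+1} (y−m−qm)^{qp+p+r}/(qp+p+r)!

     = Σ_{r=0}^{p−1} α_r Σ_{q=0}^{n+1} (−λ)^q (y−qm)^{qp+r}/(qp+r)!.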


Hence the result follows. 
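
The induction can be spot-checked numerically. The sketch below (Python) rests on a stated reading of the two formulae, which is an assumption wherever the printed text is unclear: the equation is taken as Φ(y) = Σ_r α_r y^r/r! − λ ∫_0^{y−m} Φ(t)(y−m−t)^{p−1}/(p−1)! dt for y > m, and the proposed solution as Φ(y) = Σ_r α_r Σ_q (−λ)^q (y−qm)^{qp+r}/(qp+r)!, with q running over the terms for which y−qm ≥ 0. The constants α_r and λ are arbitrary test values, and the integral is evaluated by Simpson's rule between the knots 0, m, 2m,....

```python
from math import factorial, floor

def phi(y, alpha, lam, m):
    """Series solution: sum_r alpha_r sum_q (-lam)^q (y - q m)^(q p + r) / (q p + r)!."""
    p = len(alpha)
    n = floor(y / m) if y > 0 else 0   # q = 0..n, i.e. the terms with y - q m >= 0
    total = 0.0
    for r, a_r in enumerate(alpha):
        for q in range(n + 1):
            e = q * p + r
            total += a_r * (-lam) ** q * (y - q * m) ** e / factorial(e)
    return total

def simpson(f, lo, hi, steps=200):
    """Composite Simpson rule with an even number of steps."""
    h = (hi - lo) / steps
    acc = f(lo) + f(hi)
    for i in range(1, steps):
        acc += (4 if i % 2 else 2) * f(lo + i * h)
    return acc * h / 3

def residual(y, alpha, lam, m):
    """Left side minus right side of the integral equation at a point y > m."""
    p = len(alpha)
    kernel = lambda t: phi(t, alpha, lam, m) * (y - m - t) ** (p - 1) / factorial(p - 1)
    # Integrate piecewise so that each piece of the integrand is a polynomial.
    knots = [j * m for j in range(int((y - m) // m) + 1)] + [y - m]
    integral = sum(simpson(kernel, a, b) for a, b in zip(knots, knots[1:]) if b > a)
    poly = sum(a_r * y ** r / factorial(r) for r, a_r in enumerate(alpha))
    return phi(y, alpha, lam, m) - (poly - lam * integral)
```

For instance, with α = (1.0, 0.3), λ = 0.5 and m = 1 the residual at points such as y = 1.2, 2.7 or 3.4 is zero to within the accuracy of the quadrature, as the Beta-integral step in the induction predicts.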

It remains to evaluate the unknown constants α_0, α_1,..., α_{p−1} occurring
in the expressions (9) and (10), so that Φ(y) will then be completely
known for the entire range (−∞ < y < k). These, however, are
determined by (8). We have


k-—m 
0 


p-1 
= (—1P*- Sa, S (—aye 
2, J (qp+8)!(p—r—1)! 


where 
= f —qm +8, m)P-t- 
>| (qp+s)!(p—r—1)! 


(r,s = 0,1,...,p—1). (12) 


q=0 


Then the equations (8) can be written as 


p-1 p-r-1 k 
ty = > (r = 0,1,....p—1), (13) 
which are p linear equations in the p unknowns α_0, α_1,..., α_{p−1} and have
a unique solution provided that the determinant |I−λD| does not vanish:
that is, provided that λ^{−1} is not a characteristic root of the matrix
||d_{rs}|| = D. Assuming this condition to be satisfied, we have the solution


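Φ(y) = Σ_{r=0}^{p−1} α_r y^r/r!    (−∞ < y < m),

Φ(y) = Σ_{r=0}^{p−1} α_r Σ_{q=0}^{n} (−λ)^q (y−qm)^{qp+r}/(qp+r)!
    (nm < y < (n+1)m; n = 1, 2,..., N+1).    (14)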


3. Stationary distributions of the dam content and the dam
storage


I proceed to obtain the stationary distributions of the dam content 
and the dam storage. From (6) we have 


H(y) = YO(k—y), 


which gives, if we take k = (N+1)m for convenience, 


H(y) = 


(qp+r)! 
(sm < y < (s+1)m; s = 0,1,...,N—1), 


(15) 


(y > Nm) 


\ r=0 


for the stationary distribution of the dam content Z_t+X_t. From (14)
we note that, for large negative y, Φ(y) behaves like y^{p−1}, so that, from
(6), H(∞) = 1, as required. For the dam storage Z_t we have the relations

Pr{Z_{t+1} = 0} = Pr{Z_t+X_t < m},
Pr{Z_{t+1} = k−m} = Pr{Z_t+X_t > k},
Pr{0 < Z_{t+1} < z} = Pr{m < Z_t+X_t < m+z}    (z < k−m)

as a consequence of the release rule. From these relations we see that
the stationary distribution of Z_t has discontinuities at z = 0 and
z = k−m given respectively by

F(0) = H(m) = 1—e#*-™ > 


N-1 


(qp+r)! 


r=0 


1—H(k) 
while, in the range 0 < z < k−m, its c.d.f. is given by
F(z) = H(m+2) 


] > a, 


(qp+r)! 


< (s+1)m; = 0,1,..., N—1) 


. (17) 
r=0 q=0 


(sm <z 




I am indebted to Dr. J. Gani and Professor P. A. P. Moran for many 
helpful suggestions. 


REFERENCES 


1. J. Gani and N. U. Prabhu, ‘Stationary distributions of the negative
   exponential type for the infinite dam’, J. Royal Statist. Soc. B, 19 (1957) 2.
2. P. A. P. Moran, ‘A probability theory of dams and storage systems’,
   Australian J. Appl. Sci. 5 (1954) 116-24.
3. P. A. P. Moran, ‘A probability theory of dams and storage systems:
   modifications of the release rules’, ibid. 6 (1955) 117-30.
4. P. A. P. Moran, ‘A probability theory of dams with a continuous release’,
   Quart. J. of Math. (Oxford) (2) 7 (1956) 130-7.




COMPLEX EXTENSIONS 
H. B. SHUTRICK (Liverpool) 


[Received 24 July 1957] 


Introduction 


THE object of this paper is to show how real analytic structures of
differential geometry can be embedded in similar complex structures 
in such a way that they form the real parts of them. A familiar example 
of this kind of technique is the use of imaginary points with complex 
coordinates in real projective geometry. Another example is provided 
by almost-complex structures where it is usual to take complex coordi- 
nates locally to clarify the integrability conditions. 

The paper is in three parts. The first shows how analytic functions 
and regular mappings can be extended from the real to the complex 
field and it leads to the definition of complex extension of a manifold. 
It concludes with some properties and with brief proofs of two theorems: 
one shows the relationship between ‘complex extension’ and ‘real sub- 
manifold’ as defined by Ehresmann, and the other states that the 
germ of complex extension is unique. The second part is devoted to a 
proof of the theorem that any analytic real manifold with a countable 
base of open sets admits a complex extension. It is shown earlier that 
this is equivalent to a theorem stated without proof by Ehresmann. It 
was however obtained independently by the author, and a detailed 
proof is given here partly because I think that the theorem is important 
and partly because it may have been assumed that the proof is more 
obvious than in fact it is. The final part is concerned with the exten- 
sion of manifolds carrying some additional analytic structure. 


1. Definitions and basic results 


1.1. Notation. If f is a mapping of a subset of a space onto a subset
of another space, it is convenient to consider these subsets as depending
on f by writing f: U(f) → V(f). If f, g are two such mappings and if
U(f), V(g) are in the same space, the most useful law of composition
is (f,g) → fg where fg is the onto mapping


fg: g^{−1}{U(f) ∩ V(g)} → f{U(f) ∩ V(g)}.


Quart. J. Math. Oxford (2), 9 (1958), 189-201. 




The real number space R^n is always considered as the real part of
the complex number space C^n in the natural way. Let the modulus |z|
of a complex vector z of C^n be given by |z| = √(Σ_i z̄_i z_i), where the z_i
are the components of z.

1.2. Extensions of functions. This section may be read in conjunction
with chapter ii § 2 of Bochner and Martin [(1) 33]. A complex-valued
analytic function g defined on an open set U(g) of C^n is said to be a
complex extension of its restriction f to R^n ∩ U(g). For example, if f is
a real-valued or complex-valued function defined by an absolutely
convergent power series F(x−a) in the components of x−a (where x
is a variable point of R^n and a is a fixed point) and U(f) is the open
ball of convergence given by |x−a| < k, then F(z−a) is convergent
in the open ball of C^n given by |z−a| < k and defines a complex
extension g. It should be noted that a complex extension is not
determined uniquely by its real part f because infinitely many restrictions
and analytic continuations of a given complex extension are also
complex extensions.
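
A one-variable instance makes the power-series example concrete. The sketch below (Python) is an illustration chosen here, not taken from the paper: it uses f(x) = 1/(1+x²) with a = 0, whose series Σ (−1)^n x^{2n} converges for |x| < 1, and evaluates the same series at complex points z with |z| < 1; the function name and the number of terms are arbitrary.

```python
def series_extension(z, terms=200):
    """Partial sum of sum_n (-1)^n z^(2n), the power series of 1/(1+x^2) about 0."""
    total, power = 0j, 1 + 0j
    for _ in range(terms):
        total += power
        power *= -z * z   # next term: multiply by -z^2
    return total

real_point = 0.4
complex_point = 0.3 + 0.5j   # |z| < 1, inside the complex ball of convergence
```

On the real axis the partial sums reproduce f, and at `complex_point` they agree with 1/(1+z²), which is the complex extension defined by the same series on the open ball |z| < 1.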

Suppose, however, that g is a complex extension of f and that
U = U(g) is connected to R^n, i.e. each connected component of U
contains a point of U(f), then g is the only complex extension of f
defined on U: it is determined uniquely by f and U.

To prove this we note that analytic functions defined on a given
connected open set are uniquely determined by their power series at
a single point. Let U' be a connected component of U and let x be
a point of U' ∩ R^n. The coefficients of the power series for f and g at
x must be the same. Hence f determines g uniquely in U' and in every
other connected component. Moreover, the processes of differentiating
f and g at x are formally the same when expressed in terms of the power
series, which shows that the derivatives of a complex extension are
complex extensions of the derivatives of the real part.

The existence of complex extensions for a general analytic function f
is now easily verified. Each point x of U(f) has an open ball U_x centred
at x and contained in U(f) such that the restriction f_x of f to U_x is
given by an absolutely convergent power series. Hence f_x can be
extended to give a complex function g_x such that U(g_x) is the complex
open ball centred at x meeting R^n in U_x. The functions g_x fit together
to form a function g on the union of the U(g_x), x ∈ U(f),
if, given any two points x, y of U(f), the restrictions of g_x and g_y to
U(g_x) ∩ U(g_y) are the same. This is true because U(g_x) ∩ U(g_y) is
connected to R^n and the restrictions are both complex extensions of the
restriction of f to U_x ∩ U_y.

1.3. Extension of local automorphisms. Let A® be the pseudo-group 
of analytic, regular homeomorphisms between open sets of R” and let 
As, be the corresponding pseudogroup for C” [Ehresmann (3) 139]. 
Consider the subset A¢ of AS consisting of those mappings which give 
members of A? when restricted to R" and whose inverses do likewise: 


that is, 
AS = {9:9 AS, gine AY, g AP}. 


A member of Aé will be called a complex extension of its restriction to 
R". The conventional mapping of the empty set in R" admits complex 
extensions of the form g € AS such that U(g)N R"® = o. 

The essential properties of A% are contained in the next three 
propositions. 


Proposition 1. The set A% forms a pseudogroup of transformations 
of C. 


In fact, the three conditions for A¢ to be a pseudogroup are easily 
proved, and will only be quoted: 

(a) The sets mapped by members of A‘ are a system of open sets: 
they are, in fact, the open sets of C” because the identity mapping of 
each is a member. 

(6) If g is a (1-1) mapping of U(g) = LU U,, where each U, is open 

a 
in C”, then g belongs to Aé if and only if its restriction to each U, 
belongs to Af. 

(c) The inverse of a member of A‘ is a member as is the composition 
of any two. 


Proposition 2. If g is a member of A‘, defined on U which is connected 
to R", then g is uniquely determined by its restriction f to R" and the 
set U. 

This follows from the corresponding result for functions because the 
n complex functions which define g must be complex extensions of 
those which define f. 


Proposition 3. Every member f of AY has a complex extension g (and 
therefore infinitely many). 


A member f of A® is an analytic mapping of an open set U(f) of R” 



into R^n and can be written f = (f_1, f_2,..., f_n), where the f_i (i = 1, 2,..., n)
are real analytic functions on U(f). Let g_i be complex extensions of
the f_i. For each point x of U(f), let U_x be some open neighbourhood
of x in the intersection of the U(g_i) (i = 1,..., n) and let g_x be the
mapping of U_x into C^n defined by restricting the functions g_i. I shall
prove that, if each U_x is chosen carefully, then the mapping g formed by
fitting together the mappings g_x, as x varies over U(f), is a complex
extension of f. Note that the Jacobian of the g_i takes the same value as
the Jacobian of the f_i at x and is therefore non-zero. It follows from the
implicit-function theorem [(1) 39] that the U_x can be chosen such that
each g_x is regular. Let us choose the U_x such that g_x is regular and
that g_x(U_x) is an open ball centred at g_x(x). Then, for two points x and
y of U(f), we have that g_x(U_x) ∩ g_y(U_y) is connected to R^n. However,
the restrictions of g_x^{−1} and g_y^{−1} to this open set are both complex
extensions of a restriction of f^{−1}, and so Proposition 2 ensures that they
are the same. This completes the proof because g is locally regular and
one-to-one.


1.4, Extension of manifolds. Let M be a real analytic manifold 
defined by a complete atlas o/ of a Hausdorff space onto R" com- 
patible with A® [(3) 139]. A complex extension of M is a complex 
analytic manifold N which contains M as a subset and which has a sub- 
atlas # compatible and complete with respect to Af such that 7 is 
the set of restrictions of members of Z to R". The embedding of M in 
N will then be analytic and proper. 

If N, N' are complex extensions of M, M', respectively, and if f is
an analytic mapping of M into M', then there exists an analytic
mapping g of an open neighbourhood U of M in N into N' such that
f is the restriction of g to M. To prove this, let U_α be a collection of
open sets in N which cover M and on which mappings g_α are defined by
extending restrictions of f. The system U_α has a locally finite
refinement and therefore also a refinement such that the intersection of every
pair is connected to M. The restrictions of the mappings g_α to open
sets of the refinement agree in the overlaps and will fit together to
form g.

In the case when f is locally-regular, g can be chosen locally-regular.
In particular, if f is an isomorphism between M and M', the extension
process can be applied to f and f^{−1} as in Proposition 3, making g an
isomorphism of U on g(U).


The next theorem is an immediate consequence of this. 


UNIQUENESS THEOREM.† If N, N' are complex extensions of M, there
exists a neighbourhood of M in N isomorphic with a neighbourhood of
M in N'.


The following theorem relates the complex extension of a manifold 
with Ehresmann’s definition of real submanifold [(2) 417]: 


THEOREM. If N is a complex manifold containing M as a proper,
real, closed, analytic, submanifold and if dim M = ½ real dim N, then N
has a unique subatlas B which makes it a complex extension of M.


Consider the coordinate mappings f of the manifold N such that
V(f) does not intersect M and U(f) does not intersect R^n. These
mappings form an atlas for N−M.

For a given point y of M, let g and h be coordinate mappings of N
and M respectively giving coordinate neighbourhoods V(g) and V(h)
of y. Since M is properly embedded, we can choose g such that
V(g) ∩ M ⊂ V(h), and, since the embedding is analytic, g^{−1}h is an analytic
mapping of an open set of R^n into C^n, which means that g^{−1}h is given
by n complex-valued analytic functions f_i (i = 1, 2,..., n). We choose
complex extensions k_i of these functions and we take an open
neighbourhood U of x = h^{−1}(y) in C^n on which the functions k_i are all defined.
The k_i define an analytic mapping k of U into C^n, and I shall show
that k is regular if U is taken small enough. This will be true if the
k_i have non-zero Jacobian at x, which is equivalent to saying that the
derived mapping k_* defined by differentiating the k_i at x maps the
tangent vectors at x onto those at g^{−1}(y). The restriction of k_* to real
vectors is the derivative of g^{−1}h at x mapping the real vectors onto the
space X of vectors tangential to g^{−1}{M ∩ V(g)} at g^{−1}(y). The image of k_*
is the space spanned by X and √(−1)X because of the linearity of k_*,
so k_* will be onto if √(−1)X is transversal to X. The condition that
M should be a real submanifold is exactly that X and √(−1)X are
transversal. It follows that k is regular, and gk is then a coordinate
mapping of N which takes U(gk) ∩ R^n onto V(gk) ∩ M.

The atlas of N—M combined with the mappings gk as y varies over 


† I am indebted to Professor H. Cartan for drawing my attention to this
theorem.




M give an atlas of N compatible with Af. Its completion is the re- 
quired atlas Z. 

Ehresmann [(2) 417] states that, for any analytic manifold M, there
is a neighbourhood N of the diagonal Δ in M × M which admits a
complex analytic structure with Δ, isomorphic to M, as analytic real
submanifold. This implies that a complex extension of any manifold exists
as will be proved in § 2.† It is, however, an open question whether there
is an isomorphism between the 2n-dimensional analytic structures on
N induced by M × M and the complex structure. An isomorphism
certainly does exist when M admits an analytic locally-regular
embedding in R^m (m > dim M = n). In this case, a complex submanifold
V of C^m can be defined by extending the m−n analytic functions which
define M locally, and V is a complex extension of M. Let g be the
embedding of M × M in C^m given by first embedding M × M in R^m × R^m
in the natural way, then by mapping (x, y) onto x + √(−1)(y−x). Thus
g is analytic, locally-regular, and takes Δ onto the real part of V.
Moreover, the 2n-dimensional tangent planes to g(M × M) and V
coincide on the real part, so that the normal (2m−2n)-planes to V set up
an analytic isomorphism between neighbourhoods of the real part in
the two manifolds.


2. Existence theorem 

Every real analytic manifold satisfying the second axiom of countability 
admits a complex extension. 

2.1. Here we set up the basic structure required for the proof. 


Choice of atlas. Assume that the real manifold M is covered by a family of coordinate neighbourhoods $V(f'_i)$, indexed over the positive integers, such that each $V(f'_i)$ is relatively compact and the covering is locally-finite of order $n+1$. The coordinate mappings $f'_i$ form an atlas $\mathcal{A}'$. Such an atlas always exists and can, in fact, be constructed from a suitably fine simplicial decomposition of M. Next, refine $\mathcal{A}'$ to give an atlas $\mathcal{A}$ satisfying the same conditions as $\mathcal{A}'$, but such that $\overline{V(f_i)} \subset V(f'_i)$. This can be done as follows. Consider $C_1$, the complement of $\bigcup_{i>1} V(f'_i)$, which is a closed subset of $V(f'_1)$. Then $C_1$ has an open neighbourhood $V_1$ whose closure is contained in $V(f'_1)$, because $C_1$ and the frontier of $V(f'_1)$ are non-intersecting closed sets and can be separated in the normal space $\overline{V(f'_1)}$. This process is now repeated on $V(f'_2)$ in the


† [Note added in proof.] The existence theorem can be proved by putting a complex analytic structure on N such that the 2n-dimensional subordinate real structure is $C^\infty$ isomorphic to the induced real structure on N.




COMPLEX EXTENSIONS 195 

covering $\{V_1, V(f'_2), V(f'_3), \ldots\}$, and so on, giving eventually $\{V_1, V_2, \ldots\}$. It is easily shown that $\{V_i\}$ forms a covering of M and that $\overline{V_i} \subset V(f'_i)$. The new atlas $\mathcal{A}$ is given by letting $f_i$ be the restriction of $f'_i$ to $f_i'^{-1}(V_i)$.
Complex extensions of coordinate changes. Let $r: C^n \to R^n$ be the retraction which maps complex vectors onto their real parts. For each pair i, j such that $i > j$, choose complex extensions $\phi_{ij}$ and $\phi'_{ij}$ of the changes of coordinates $f_i^{-1}f_j$ and $f_i'^{-1}f'_j$ respectively; Proposition 3 ensures that this is possible. We make the choice of the $\phi_{ij}$ in such a way that

    $\overline{U(\phi_{ij})} \subset U(\phi'_{ij})$;
    $r\{U(\phi_{ij})\} = U(f_i^{-1}f_j)$.    ($\Phi$)

The first condition can be satisfied because the closure of $U(f_i^{-1}f_j)$ is in $U(\phi'_{ij})$, and the second is possible because the inverse image of an open set of $R^n$ under r is an open neighbourhood of the set. Let $\phi_{ij} = \phi_{ji}^{-1}$ for all $i < j$ and let $\phi_{ii}$ be the identity mapping of the set $U_i = r^{-1}\{U(f_i)\}$; similarly define the mappings $\phi'_{ij}$ for all pairs i, j.
2.2. Method of proof. It is known that $M = \sum U(f_i)/R$, where R is the equivalence relation

    $x_i \equiv x_j$ if $x_i = f_i^{-1}f_j(x_j)$.

By analogy, consider the relation S defined in $\sum U_i$ by

    $z_i \, S \, z_j$ if $z_i = \phi_{ij}(z_j)$.

The relation S is reflexive because $z_i = \phi_{ii}(z_i)$ and symmetric because $\phi_{ij} = \phi_{ji}^{-1}$. However, S may not necessarily be transitive. The procedure is therefore as follows:

Part 1. Replace the mappings $\phi_{ij}$ by suitable restrictions $\phi^*_{ij}$, making $S^*$ an equivalence relation. The factor space is then like a complex extension except that it may not be Hausdorff.

Part 2. Replace the sets $U_i$ by suitable subsets $U^*_i$ such that $N = \sum U^*_i/S^*$ is Hausdorff and is a complex extension of M.
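
The difficulty addressed in Part 1 can be seen on a finite toy model. The sketch below is only an illustration under stated assumptions: the three sets `U[1]`, `U[2]`, `U[3]` and the partial maps passed to `make_maps` are hypothetical stand-ins for the open sets $U_i$ and the mappings $\phi_{ij}$, chosen so that the relation is reflexive and symmetric but not transitive until one map is restricted.

```python
from itertools import product

# Hypothetical finite stand-ins for the sets U_1, U_2, U_3.
U = {1: {0, 1, 2}, 2: {0, 1, 2}, 3: {0, 1, 2}}

def make_maps(pairs):
    """Build phi[i][j], a partial map from U_j into U_i.

    `pairs` gives phi[i][j] for i > j; phi[j][i] is taken to be its inverse
    and phi[i][i] the identity, mirroring the conventions of the text.
    """
    idx = sorted(U)
    phi = {i: {j: {} for j in idx} for i in idx}
    for i in idx:
        phi[i][i] = {z: z for z in U[i]}
    for (i, j), m in pairs.items():
        phi[i][j] = dict(m)
        phi[j][i] = {image: point for point, image in m.items()}
    return phi

def related(phi, a, b):
    """(i, z_i) is related to (j, z_j) when z_i = phi[i][j](z_j)."""
    (i, zi), (j, zj) = a, b
    return phi[i][j].get(zj) == zi

def properties(phi):
    """Return (reflexive, symmetric, transitive) for the relation."""
    pts = [(i, z) for i in sorted(U) for z in sorted(U[i])]
    reflexive = all(related(phi, a, a) for a in pts)
    symmetric = all(related(phi, b, a)
                    for a, b in product(pts, pts) if related(phi, a, b))
    transitive = all(related(phi, a, c)
                     for a, b, c in product(pts, pts, pts)
                     if related(phi, a, b) and related(phi, b, c))
    return reflexive, symmetric, transitive

# Unrestricted maps: the point 1 is glued from U_1 to U_2 and from U_2 to U_3,
# but phi[3][1] is empty, so transitivity fails.
S = make_maps({(2, 1): {0: 0, 1: 1}, (3, 2): {1: 1, 2: 2}})
# Restricting phi[2][1] removes the offending identification.
S_star = make_maps({(2, 1): {0: 0}, (3, 2): {1: 1, 2: 2}})
```

In this toy case `properties(S)` reports a failure of transitivity only, while `properties(S_star)` reports an equivalence relation; the Hausdorff question of Part 2 has no finite analogue here and is not modelled.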


2.3. To carry out Parts 1 and 2, it is convenient to introduce the following open sets of M, $R^n$, $C^n$.


Carriers $A_{\alpha q}$ in M. Consider sets of distinct positive integers

    $\alpha = (i_1, i_2, \ldots, i_q)$


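such that

    $A_{\alpha q} = \bigcap_{i \in \alpha} V(f_i) \neq \emptyset$,

that is, $\alpha$ indexes a non-empty carrier. It will be noted that $1 \le q \le n+1$.

An intersection relation satisfied by the carriers is

    $A_{\alpha q} \cap A_{\beta q} = A_{\gamma, q+1}$ if $\alpha, \beta \subset \gamma$, $\alpha \neq \beta$.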

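Neighbourhoods $P_{\alpha\beta}$ in M. Associate with each set $A_{\beta, q+1}$ the set $B_{\beta q}$, which is the union of all sets $A_{\alpha q}$ such that $\alpha \subset \beta$. Denote the frontier of an open set A by $\partial A$ ($\partial A = \overline{A} - A$) and consider $\partial A_{\beta, q+1} \cap B_{\beta q}$. The set $A_{\alpha q} - A_{\beta, q+1}$ is closed in $B_{\beta q} - A_{\beta, q+1}$ if $\alpha \subset \beta$, because it is the complement of the union of all other carriers of order q contained in $B_{\beta q}$. It follows that $\partial A_{\beta, q+1} \cap A_{\alpha q}$ is closed in $B_{\beta q}$, so the subset $\partial A_{\beta, q+1} \cap B_{\beta q}$ of $B_{\beta q}$ is partitioned into $q+1$ closed sets of the form $\partial A_{\beta, q+1} \cap A_{\alpha q}$. The space $B_{\beta q}$ is normal and therefore there exist $q+1$ non-intersecting open neighbourhoods $P^*_{\alpha\beta}$, respectively, of the $q+1$ closed sets. The set $A'_{\beta, q+1}$, which is deduced from $\mathcal{A}'$ in the way that $A_{\beta, q+1}$ was deduced from $\mathcal{A}$, is an open neighbourhood of $\overline{A}_{\beta, q+1}$. Let $P_{\alpha\beta}$ be the intersection of $P^*_{\alpha\beta}$ with $A_{\alpha q}$ and with this neighbourhood.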

The properties of the open sets Pg can be summarized as 

(i) $P_{\alpha\gamma} \cap P_{\beta\gamma} = \emptyset$ if $\alpha, \beta \subset \gamma$, $\alpha \neq \beta$,

(i) Pygc if ies, 

(iii) $P_{\alpha\beta}$ is a neighbourhood of $\partial A_{\beta, q+1} \cap A_{\alpha q}$ in $A_{\alpha q}$.

Sets Ai,, Pig in R". For each i in a, let Ai, = f,(A,,). Thus we 
have 

(Ay) Ai, nN Abn => if i B Cc 3 B. 
Also, the closure of A,, is in A),, and so the frontier points of A/, will 
be mapped onto those of A}, by ¢j;. Let Pig = f((P,g) for each i € 8, 
giving 
(P;) Pry = fifj( Php) if i, 7 € B, 
(Py) Pi if a, BCy,a #8, 

(Pix) Iftiea, Pig is a neighbourhood of in Aig. 

Sets $C^i_{\alpha q}$ in $C^n$. I shall define open neighbourhoods $C^i_{\alpha q}$ of $A^i_{\alpha q}$ in $U_i$ such that, for each pair i, j in $\alpha$,

    $C^i_{\alpha q} = \phi_{ij}(C^j_{\alpha q})$.




If we consider mappings such as

    $\theta_{ij} = \phi_{ik}\phi_{kl} \cdots \phi_{mj}$,

composed of mappings, where the pairs i, k; k, l; ...; m, j are taken from $\alpha$, we note that each $\theta_{ij}$, and in particular $\phi_{ij}$, is a complex extension of the mapping $f_i^{-1}f_j$. We therefore choose the sets $C^j_{\alpha q}$ connected to $R^n$, ensuring that the restrictions of $\theta_{ij}$ and $\phi_{ij}$ to $C^j_{\alpha q}$ are the same mapping for all i, j, $\alpha$ by Proposition 2.

Let $\bar\theta_{ij}$ be a mapping similar to $\theta_{ij}$ above but let the bar indicate that every ordered pair l, m of $\alpha$ is included in the mappings $\phi_{lm}$ composed. Define $C^j_{\alpha q}$ to be the union of those connected components of $U(\bar\theta_{ij})$ which contain connected components of $A^j_{\alpha q}$.

It must first be shown that, if $C^{*j}_{\alpha q}$ is defined by another such mapping $\bar\theta^*_{ij}$, then $C^{*j}_{\alpha q} = C^j_{\alpha q}$. Assume, alternatively, that one of them, say $C^j_{\alpha q}$, has some points not included in the other. Let $\phi_{lm}$ be the first of the mappings in $\bar\theta^*_{ij}$ which does not map the full image of $C^j_{\alpha q}$ under previous components. Then $\phi_{lm}$ also occurs somewhere in $\bar\theta_{ij}$ and, in this case, it does map the full image under previous components. The previous components in each case are complex extensions of $f_m^{-1}f_j$, so that they map $C^j_{\alpha q}$ in the same way. This leads to a contradiction, from which we deduce that $C^{*j}_{\alpha q} = C^j_{\alpha q}$.

To prove that Ci, = 0, (Cha ), let 6% *. include all ordered pairs from a 
which do not occur in 6;;, so that 6,,6% 6,; and 6,; 6%, contain all ordered 
pairs. Hence, 6,;6%,0,; maps Cj, and, consequently, 6,67; maps 6,;(C4,) 
which is therefore contained in Cia. The equality follows from a similar 
argument with 

Complex carriers Di,. The sets Ci, are, to some extent, complex 
analogues of the A‘, but they do not satisfy the intersection relation 
next step is to define subsets of such that 


(Dy) = Poa) if k,j € a, 


(Dy) is an open neighbourhood of Aj, in 7 (At, ). 

These sets are defined by an inductive process starting with 
Dio.) = Cinsy and defining Di, in terms of Di... Let Di, be the 
subset of Ci, given by 

(a) 
or (6) r(z)e Pi, 
or (c) ¢ 


and let Dig = 1) 

LY 
where the intersection ranges over all j in a and over all y for which 
P',, is defined. It will be noted that the intersection contains a finite 
number of terms, because a given coordinate neighbourhood intersects 


only a finite number of others. 
Assume (D,;;) true for each Di,,,,). Then 


Ava by (a), (6), (c), 
c Ct, by definition, 
~ U(¢,;), by definition of C{, in § 2.3, 


cr(At,), by (®) for in § 2.1. 
Hence, (A;) gives Ai, c Di, cr(Ai,). The subsets of Ci, given by (a) 
and (b) are open, so that the only possible non-interior points of D\, 


are in r(CAbg »), which is contained in (b). Hence, each D\, is open, 
which implies that Di, is open. The sets Di, ,,) certainly satisfy (Dj,,;); 
so (D,;;) is true for all g by induction. 

Condition (D,) is satisfied because 


Dia = Pry) 
=f) $M Di,), since Dic 
tea 
$5(Di,)) 
= 


We now verify condition D,;. Remember that D‘, c r(Aj,). Hence, 
ifiea, Bc y, it follows from (A,,) that 
Also, by (a), (6), (c), any point of intersection of the two sets is either 


in D\yq.1), or in the intersection of two sets r(P%,) and r (Phy). Hence 
= Bi, n 
Condition (D,,;) follows by using 
= $ij(Digiy) and Di,c 
2.4. We are now able to proceed with the proof as indicated in § 2.2. 


(P;,;) shows that 


= 


Part 1. Let $f; be the restriction of ¢,; to Dj. and let $f = $4. 
Thus, ¢¥; is the restriction of ¢;, to 


= Dinas 
which is 47;. Also, if z; = $%(z;) and z; = $%,(z,), then 
2; © Dina N 
= Diinz by (Dy), 
= by (Dy). 
Hence, Z; € Dk, by (D,), and z; = $%,(z,) by the uniqueness property 
of mappings on 
The relation $S^*$,

    $z_i \, S^* \, z_j$ if $z_i = \phi^*_{ij}(z_j)$,

is an equivalence relation in $\sum U_i$.

Part 2. The quotient mapping of $\sum U_i/S^*$ is open and this makes it easy to specify exactly which pairs of points of $\sum U_i$ give rise to separable points of the quotient. If two points are in the same $U_i$, they clearly have separable images. Let $z_i$ and $z_j$ be points of $U_i$ and $U_j$ respectively whose images are not separable. Then $z_j$ cannot be in $U(\phi^*_{ij})$, for it would then be equivalent to a point of $U_i$. On the other hand, any neighbourhood of $z_j$ must have points equivalent to points in $U_i$, and so $z_j$ must be a frontier point of $U(\phi^*_{ij})$. Similarly, $z_i$ is a
frontier point of U(¢%). If z; # ¢;,(z;), we can separate these points 
by enclosing them in open sets W, and Wj, respectively, of U;, and the 
quotient images of the open sets W; and ¢;,(W; 9 U(¢;,)) are non-inter- 
secting, which contradicts the hypothesis that z; and z; have non- 
separable images. We conclude that the only pairs of points of = U;/S* 
which are not Hausdorff-separable are of the form z,, ¢;;(Z;), where Z, 
is a frontier point of Dijjj9. Let C%ji, be obtained from the ¢j; in the 
way that Ci, are obtained from the ¢,;. Let 


Thus Qj is an open neighbourhood of @Ajjp_ in U; and 
QN Fil =, by (P;) and 
Let U, = (U;—@Dhiyya) U Qj, 80 that U;, is a neighbourhood of U({,) 


in U;. It is then clear that U,;+U;;,, factored by the restriction of S* 
to it, is Hausdorff. Hence let 




where the intersection is taken over the j for which

    $V(f_i) \cap V(f_j) \neq \emptyset$.

The intersection has a finite number of terms so that $U^*_i$ is open and the space $\sum U^*_i/S^*$, being Hausdorff, is the required complex extension of M.


3. Extension of analytic structures 

It frequently occurs that analytic structures on the real manifold M are of types which are uniquely determined by giving analytic functions $h^i_a$ ($a = 1, 2, \ldots, r$) on each open set $U(f_i)$ of the atlas $\mathcal{A}$ used in the construction of the complex extension. Usually, the structure and $\mathcal{A}$ do not determine the functions uniquely: the $h^i_a$ may represent an analytic equivalence class of sets of r functions defined on $U(f_i)$. Also the $h^i_a$ functions do not define a structure unless certain analytic identities are satisfied in the coordinate overlaps. These identities involve the change of coordinates $f_i^{-1}f_j$, the functions $h^j_a$, the functions $h^i_a f_i^{-1}f_j$, and their derivatives. An extension of such a structure is given by constructing an extension N of the manifold M as in § 2 but,

(i) choosing the sets $U_i$ in § 2.1 as the subsets of $r^{-1}(U(f_i))$ on which complex extensions $k^i_a$ of the $h^i_a$ are defined, and,

(ii) choosing the $\phi_{ij}$ (defining them over sufficiently small neighbourhoods) in such a way that the complex extensions of the analytic identities are satisfied.

Suppose, for example, that M is a Riemannian space, so that the structure is determined by giving the components $g_{ab}$ of the metric tensor in each coordinate neighbourhood and the identities in the overlaps are the usual tensor law of transformation. The complex extension is a complex manifold N with a complex tensor $g_{ab}$ which, having a rank dim M, does not define a metric on N but which is such that the imaginary complex directions at a point of M are realized as directions transversal to M in N. For instance, the null cone of a positive definite metric appears in the complex extension.

An analytic almost-complex structure on a 2n-dimensional manifold M is determined by giving an analytic field of complex n-elements $X_n$, such that $X_n$ and $\overline{X}_n$ have only one point in common, the origin of the tangent space [(2) 414]. The almost-complex structure on M can be extended to a complex manifold N as above, and the field of imaginary n-elements $X_n$ give a field of complex n-elements $Y_n$ in N which are transversal to M.




The almost-complex structure on M is complex if and only if there is a subatlas of preferred coordinates such that $X_n$ is given by

    $du^i + \sqrt{-1}\,du^{i'} = 0$    ($i = 1, 2, \ldots, n$; $i' = i+n$).

A necessary and sufficient condition that the almost-complex structure on M should be complex is that the field of complex n-elements $Y_n$ is completely integrable in N. The condition is obviously necessary because the equations $du^i + \sqrt{-1}\,du^{i'} = 0$ are completely integrable. Conversely, if $Y_n$ are tangent elements to laminas given locally by putting n analytic functions $g^i$ equal to constants, then

    $g^i = h^i + \sqrt{-1}\,h^{i'}$

on M, where $h^i$, $h^{i'}$ are real analytic functions. These 2n functions are independent because the laminas are transversal, and thus they give the preferred coordinate systems. The integrability condition is the usual one when expressed in terms of the forms, that is,

    $d\omega^i \equiv 0 \pmod{\omega^j}$,

where $\omega^i$ are the complex forms defining $X_n$.

Patterson's theorem [(4) 266], that an analytic almost-Kähler metric is always Kähler, can be deduced in N from the fact that parallel planes are integrable.


REFERENCES 


1. S. Bochner and W. T. Martin, Several Complex Variables (Princeton, 1948). 

2. C. Ehresmann, ‘Sur les variétés presque complexes’, Proc. Int. Cong. Math. 2 
(1950) 412. 

3. —— ‘Structures locales’, Ann. di Mat. Pura ed Appl. (Roma) 36 (1954) 133. 

4. E. M. Patterson, ‘A characterisation of Kähler manifolds in terms of parallel
fields of planes’, J. Lond. Math. Soc. 28 (1953) 260.




ON A FUNCTIONAL EQUATION 


By T. W. CHAUNDY and J. B. McLEOD (Oxford)
[Received 24 July 1957; in revised form 2 December 1957]


1. It is a problem of some interest in the statistical thermodynamics of mixtures† to obtain the general solution of the functional equation

    $f(x) + u f(vx) = U f(Vx)$,    (1)

where x, u, v are independent parameters, f(x) is the unknown function, required to be continuous, and U, V are (unknown) functions of u, v alone whose form will depend (presumably) on the form of f. For the purposes of the problem we need consider only positive values of x, u, v, U, V; but for the mathematical analysis it is reasonable, in view of the symmetry of (1), to admit values of u, U of either sign, and this may lead to imaginary values of v, V, as may be seen from the solutions asserted below.

Since V is defined implicitly, it may well be many-valued: to 
distinguish between these possible branches it is not unreasonable to 
impose on V (u,v) the condition of being continuous. 

We show that the general solution of (1) is

    $f(x) = A x^a + B x^b$,    (2)

where A, B, a, b are constants; U, V are then given by

    $1 + u v^a = U V^a$,  $1 + u v^b = U V^b$.    (3)

There are two exceptional or limiting cases of this. Corresponding to the limit $b \to a$, we may have

    $f(x) = (A + B \log x)\,x^a$,    (4)

where now

    $1 + u v^a = U V^a$,  $u v^a \log v = U V^a \log V$.    (5)

Corresponding to B = 0, we may have

    $f(x) = A x^a$,    (6)

and then u, v, U, V are connected by the single relation

    $1 + u v^a = U V^a$.    (7)

We exclude as sufficiently pointless the solution in which A = 0, B = 0 and f(x) is identically zero.
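
As a numerical sanity check on (2) and (3), the following sketch solves the two relations (3) for U and V in closed form and evaluates the residual of (1). It is illustrative only: the function names and the sample values of the constants are assumptions introduced here, and positive x, u, v with $a \neq b$ are assumed so that all powers are real.

```python
def f(x, A, B, a, b):
    """General solution (2): f(x) = A x^a + B x^b."""
    return A * x**a + B * x**b

def solve_UV(u, v, a, b):
    """Solve 1 + u v^a = U V^a and 1 + u v^b = U V^b for U, V (a != b).

    Dividing the two relations gives V^(a-b) = (1 + u v^a) / (1 + u v^b).
    """
    p = 1.0 + u * v**a
    q = 1.0 + u * v**b
    V = (p / q) ** (1.0 / (a - b))
    U = p / V**a
    return U, V

def residual(x, u, v, A, B, a, b):
    """Left-hand side minus right-hand side of equation (1)."""
    U, V = solve_UV(u, v, a, b)
    return f(x, A, B, a, b) + u * f(v * x, A, B, a, b) - U * f(V * x, A, B, a, b)

# Hypothetical sample values; any positive x, u, v with a != b should do.
r = residual(x=1.7, u=0.5, v=2.0, A=3.0, B=-1.2, a=0.5, b=2.0)
```

The residual `r` should vanish up to rounding error, since each power $x^a$, $x^b$ satisfies (1) separately once (3) holds.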


† See W. B. Brown, Phil. Trans. (A) 250 (1957) 175.
Quart. J. Math. Oxford (2), 9 (1958), 202-6. 


2. Suppose first that we can deduce from (1) a two-term relation

    $f(xt) = T f(x)$,    (8)

where x, t are independent parameters and T is a function of t alone. Taking h as a particular value of x we have

    $f(h) f(xt) = f(ht) f(x)$.

The solution of this is well known; it is, perhaps, even better known if we write it as

    $\log f(e^{k}) + \log f(e^{u+v}) = \log f(e^{v+k}) + \log f(e^{u})$,

i.e.

    $F(u+v) + F(k) = F(u) + F(v+k)$,    (9)

where u, v are independent. It is notorious that the only continuous solution of (9) is

    $F(u) = au + c$,

where a, c are constants, and so the only continuous solution of (8) is

    $f(x) = A x^a$,

where $A = e^c$. This is the solution (6).
3. Returning to (1) in the general case take t any value of x and, keeping t a disposable parameter, write

    $u = -f(t)/f(vt)$,

so that U, V are now functions of the independent parameters v, t: say $U = U(t,v)$, $V = V(t,v)$. Thus

    $f(x) - f(t) f(vx)/f(vt) = U(t,v)\, f\{x V(t,v)\}$.    (10)

This gives, when x = t, either

    (i) $U(t,v) = 0$  or  (ii) $f\{t V(t,v)\} = 0$.

If (i) holds for any t, v, we have the two-term relation

    $f(x) f(vt) = f(t) f(vx)$

in three independent parameters, and this is covered by § 2 and leads to the solution (6).
4. Alternatively, from (ii), we have, for all t, v,

    $f(x) = 0$,

where $x = x(t,v) = t V(t,v)$. Then $x(t,v)$ is continuous by the continuity imposed on V, and so, in general, f(x) vanishes for a continuous region of values of x: in other words it is identically zero over an interval. Rejecting this we have, as the only alternative, that $x(t,v)$ is some constant c: that is,

    $V(t,v) = c\,t^{-1}$,    (11)

where c is a zero of f(x).


We accordingly rewrite (10) as 


f(x) f(vx) = Ut, v) f(ct2x), (12) 
where f(c) = 0. Putting x = ¢ gives us 
f(t)f(cvr) 
f(ct) 
Write further t = cw~! and we get 


f(cw-") f(ev) 
f(x)— flevw =p fer = x); 
ie. — _ flex) (13) 


f(ev) flew)’ 

Since f(x) is not to vanish identically, we have $f(x_0) \neq 0$ for some $x_0$. By adjusting constants in x, f we may sufficiently write this f(1) = 1. This will change the value of c, but we sufficiently continue to indicate it by the same letter. Then x = 1 in (13) gives


flevw)f(1) flv) 
f(ew) ~ f(cv) flew)’ 
and so we can write (13) as

    $\dfrac{f(vx) - f(v) f(x)}{f(cv)} = \dfrac{f(wx) - f(w) f(x)}{f(cw)}$.

Thus each side must be independent of v, w and accordingly equal to some function of x only. We write therefore

    $f(vx) = f(v) f(x) - f(cv) h(x)$.    (14)

Symmetry in x, v gives

    $f(cv) h(x) = h(v) f(cx)$,

and so $h(x) = A f(cx)$ for some constant A. Then (14) is

    $f(vx) = f(v) f(x) - A f(cv) f(cx)$,

and we write this more conveniently

    $f(vx) = f(v) f(x) - g(v) g(x)$,    (15)

where $g(x) = \sqrt{A}\, f(cx)$, so that g(1) = 0. Substituting $vx$, $v^{-1}$ for x, v, we get

    $f(x) = f(v^{-1}) f(vx) - g(v^{-1}) g(vx)$.

With (15), eliminating f(vx), we get

    $g(vx) = \dfrac{\{f(v) f(v^{-1}) - 1\} f(x) - f(v^{-1}) g(v) g(x)}{g(v^{-1})}$.    (16)

Here x = 1 gives, since g(1) = 0,

    $f(v) f(v^{-1}) - g(v) g(v^{-1}) = 1$,    (17)

and (16) becomes

    $g(vx) = g(v) f(x) - \dfrac{f(v^{-1}) g(v)}{g(v^{-1})}\, g(x)$.    (18)


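5. From (15), (18) we can write

    $f(vx) - \lambda g(vx) = \{f(v) - \lambda g(v)\} f(x) - \left\{ g(v) - \lambda \dfrac{f(v^{-1}) g(v)}{g(v^{-1})} \right\} g(x)$,    (19)

where $\lambda$ is a disposable multiplier. The right-hand member is a multiple of $f(x) - \lambda g(x)$ if

    $\lambda^2 - \lambda \left\{ \dfrac{f(v)}{g(v)} + \dfrac{f(v^{-1})}{g(v^{-1})} \right\} + 1 = 0$.    (20)

We can reduce the middle term: for, from (18),

    $\dfrac{f(x)}{g(x)} - \dfrac{f(v^{-1})}{g(v^{-1})} = \dfrac{g(vx)}{g(v) g(x)} = \dfrac{f(v)}{g(v)} - \dfrac{f(x^{-1})}{g(x^{-1})}$

by interchange of x, v. Then

    $\dfrac{f(v)}{g(v)} + \dfrac{f(v^{-1})}{g(v^{-1})} = \dfrac{f(x)}{g(x)} + \dfrac{f(x^{-1})}{g(x^{-1})} = 2h$,    (21)

a constant, since it is independent of v, x.

Accordingly $\lambda$ satisfies the equation

    $\lambda^2 - 2h\lambda + 1 = 0$    (22)

and has, in general, two distinct constant values p and q ($= p^{-1}$) say. With these, (19) gives two equations of the form

    $f(vx) - p\,g(vx) = P(v)\{f(x) - p\,g(x)\}$,
    $f(vx) - q\,g(vx) = Q(v)\{f(x) - q\,g(x)\}$.

By § 2 these give, when we remove the condition f(1) = 1,

    $f(x) - p\,g(x) = R x^a$,  $f(x) - q\,g(x) = S x^b$

for some constants R, S, a, b, and elimination of g(x) gives the required form (2).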

6. Exceptionally, (22) has equal roots when $h = \pm 1$. If we change the sign of g, we change the sign of h and of $\lambda$, but nothing essential alters. In this exceptional case we may therefore take h = 1 and so $\lambda = 1$. Then, with the help of (17), (19) becomes

    $f(vx) - g(vx) = \{f(v) - g(v)\}\{f(x) - g(x)\}$

and so, still with f(1) = 1,

    $f(x) - g(x) = x^a$    (23)

for some constant a. Write

    $g(x) = x^a G(x)$.

Then from (15), (23) we get

    $G(vx) = G(x) + G(v)$,

say

    $G(e^{u+v}) = G(e^u) + G(e^v)$,

of which the only continuous solution is

    $G(e^u) = Cu$, i.e. $G(x) = C \log x$.

Thus, removing the condition f(1) = 1, we get finally

    $f(x) = (A + B \log x)\,x^a$,

which is the limiting form (4).
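
The limiting form (4) can be checked numerically against (1) in the same spirit. The sketch below is an illustration under assumptions: it takes relations (5) in the form $1 + uv^a = UV^a$, $uv^a \log v = UV^a \log V$, solves them explicitly for U and V, and uses hypothetical sample values with positive x, u, v.

```python
import math

def f_log(x, A, B, a):
    """Limiting solution (4): f(x) = (A + B log x) x^a."""
    return (A + B * math.log(x)) * x**a

def solve_UV_log(u, v, a):
    """Solve 1 + u v^a = U V^a and u v^a log v = U V^a log V for U, V."""
    w = 1.0 + u * v**a                      # this is U V^a
    log_V = u * v**a * math.log(v) / w      # from the second relation
    V = math.exp(log_V)
    U = w / V**a
    return U, V

def residual_log(x, u, v, A, B, a):
    """Left-hand side minus right-hand side of equation (1) for f_log."""
    U, V = solve_UV_log(u, v, a)
    return f_log(x, A, B, a) + u * f_log(v * x, A, B, a) - U * f_log(V * x, A, B, a)

# Hypothetical sample values.
r_log = residual_log(x=1.3, u=0.5, v=2.0, A=2.0, B=-0.7, a=1.5)
```

Here `r_log` should be zero up to rounding, which is consistent with the two relations (5) being exactly what is needed to match the $x^a$ and $x^a \log x$ terms of (1).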




ON THE COMMUTATOR SUBRING 
By J. B. McLEOD (Oxford)


[Received 1 August 1957; in revised form 30 December 1957]


1. Introduction 

Let S be a ring with a subfield K such that the elements of K commute 
with all elements of S. Let S have a unit element in K. S is said to 
be of freedom f over K if f is the minimum number of elements of S 
which, together with K, generate S polynomial-wise. 

If f = 1, S is commutative. If f = 2 and K is of infinite characteristic, I prove that the commutator subring C of S (the subring of S generated by elements of the form $r \circ s = rs - sr$ with r, s in S) is a two-sided ideal of S.†

In § 5, we obtain examples of S

(i) with f = 2 and K = GF(2);

(ii) with $f \ge 3$ and K arbitrary,
in which the commutator subring of S is not an ideal.


2. Theorem. Let f = 2 and K be of infinite characteristic. Then C is an ideal of S.


Proof. Let x, y, together with K, generate S. If $r, s \in S$, we write $r \equiv s$ if $r - s \in C$. To prove the theorem, one easily sees that it is enough to prove that, for each r, s, t in S which are monomials in x, y,

    $r(s \circ t) \equiv 0$,  $(s \circ t)r \equiv 0$.

Hence it certainly suffices to prove the following lemma:

Lemma. Let $m_1, n_1, \ldots, m_r, n_r$ be non-negative integers. Then

    $x^{m_1} y^{n_1} \cdots x^{m_r} y^{n_r} \equiv x^a y^b$,

where $a = \sum m_i$, $b = \sum n_i$.


The proof of this lemma is by induction. The result is trivial if r = 1. The proof for r = 2 is given in § 3. The general argument by induction from $r = n \ge 2$ to $r = n+1$ is given in § 4.


† The problem arose from a discussion with Dr. S. A. Jennings. Cf. the introduction to his paper, Duke Math. J. 9 (1942) 341-55.


Quart. J. Math. Oxford (2), 9 (1958), 207-9. 


3. Proof of the lemma for r = 2 
If a = 0 or a = 1, the lemma is easily verified directly. Suppose that
a ≥ 2 and let m, n be arbitrary integers such that
a−1 ≥ m ≥ 1, b ≥ n ≥ 0.


Now 


and hence 


(3.1) 
Thus, putting m = 1, 2,..., a—1 successively in (3.1), we have 
Syn 
It follows that 
— == — aty?), (3.3) 


The left-hand member of (3.3) is 
—aty’ € C. 
Thus, from (3.3), it follows that, if K is of infinite characteristic, then 
arty? = 
= 
for each integer s such that 0 < s < a, by (3.2). This proves the lemma 
for r = 2. 
4. Proof of the lemma for r > 2 
We now take r > 2 and, assuming the truth of the lemma for r—1, 
prove it for r. Then 


(4.1) 
The second term in (4.1) is congruent with 




and then by the induction hypothesis the second and third terms are
congruent with −x^a y^b and the fourth term is congruent with x^a y^b. Hence
the first term is congruent with x^a y^b, and the lemma is proved.


5. It remains to give examples of S 

(i) with f = 2 and K = GF(2); 

(ii) with f ≥ 3 and K arbitrary,
in which the commutator subring of S is not an ideal. 

(i) We take our example from the free ring generated by
non-commuting indeterminates x, y over GF(2). Then

x²y²−xyxy = x(xy−yx)y,

and so belongs to the ideal generated by the commutator subring C. 


x²y²−xyxy ∉ C.
For the only product of commutators which can be used to ‘connect’
x²y² and xyxy is (xy−yx)². But
(xy−yx)² = xyxy−yx²y+yxyx−xy²x
= 
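The two expansions used in example (i) can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes a hypothetical encoding in which a polynomial over GF(2) in the non-commuting letters x, y is a set of words, so that addition is symmetric difference. It only confirms the expansions; it does not by itself prove that the element lies outside C.

```python
# Polynomials in non-commuting x, y over GF(2), encoded as frozensets of words.
# A word is present exactly when its coefficient is 1 (encoding assumed here).

def mul(p, q):
    """Product of two GF(2) polynomials; equal words cancel in pairs."""
    out = set()
    for u in p:
        for v in q:
            out ^= {u + v}
    return frozenset(out)

def comm(p, q):
    """Commutator pq - qp; over GF(2) subtraction coincides with addition."""
    return mul(p, q) ^ mul(q, p)

x = frozenset({"x"})
y = frozenset({"y"})

c = comm(x, y)                      # xy - yx
ideal_element = mul(mul(x, c), y)   # x(xy - yx)y
square = mul(c, c)                  # (xy - yx)^2

print(sorted(ideal_element))
print(sorted(square))
```

In this encoding the square of the commutator contains the word xyxy but not xxyy, which is the observation the example relies on.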

(ii) We take our example from the free ring generated by f non- 

commuting indeterminates x, y, z,... over K. Then 
x(yz—zy) 

belongs to the ideal generated by C, but not to C. For, to express 
x(yz—zy) as a member of C, we must use a commutator of the third 


degree, e.g. (x∘y)∘z.


But any such commutator connects two terms in which the cyclic order 
of the factors is the same, and this is not the case for 


x(yz—zy). 
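The cyclic-order remark can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's proof: integer coefficients stand in for the arbitrary field K, and the helper names are hypothetical. It sums the coefficients of a polynomial over each class of words that are cyclic rotations of one another; a third-degree commutator gives zero on every class, whereas x(yz−zy) does not.

```python
from collections import Counter

def mul(p, q):
    """Product of two polynomials in non-commuting letters (word -> coefficient)."""
    out = Counter()
    for u, a in p.items():
        for v, b in q.items():
            out[u + v] += a * b
    return out

def sub(p, q):
    out = Counter(p)
    for w, b in q.items():
        out[w] -= b
    return out

def comm(p, q):
    """Commutator pq - qp."""
    return sub(mul(p, q), mul(q, p))

def cyclic_class(word):
    """Canonical representative of a word up to cyclic rotation."""
    return min(word[i:] + word[:i] for i in range(len(word)))

def class_sums(p):
    """Sum of coefficients over each cyclic class, dropping zero sums."""
    sums = Counter()
    for w, a in p.items():
        sums[cyclic_class(w)] += a
    return {k: v for k, v in sums.items() if v != 0}

x, y, z = (Counter({s: 1}) for s in "xyz")

third_degree = comm(comm(x, y), z)   # (x∘y)∘z
target = mul(x, comm(y, z))          # x(yz - zy)

print(class_sums(third_degree))
print(class_sums(target))
```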




ONE-DIMENSIONAL CHARACTERISTICS OF A 
PARTIAL DIFFERENTIAL EQUATION OF THE 
SECOND ORDER, WITH ANY NUMBER OF 
INDEPENDENT VARIABLES 


By D. H. PARSONS (Reading) 
[Received 1 August 1957] 


In any attempt to extend the classical theory of partial differential 
equations of the second order to equations with more than two inde- 
pendent variables, the first problem is that of defining ‘characteristic 
multiplicities’. Goursat [(2) 219 chap. x] has pointed out that, out 
of several possible definitions, two are naturally indicated. Firstly, if 
there be m+1 independent variables, one can start from the problem
of Cauchy, generalized, and define a characteristic multiplicity of order
n (n ≥ 2) to be an m-dimensional multiplicity of elements of contact
of order n, contained in an infinity of integral multiplicities. This 
definition, adopted by J. Beudon (1), is satisfactory up to a point, but 
does not lead to an extension of Darboux’s method. 

The second definition is as follows. Consider an equation of the 
second order, φ = 0 say, with one dependent and m+1 independent
variables. Then we define a characteristic multiplicity of order n (n ≥ 2)
to be a one-dimensional multiplicity of elements of contact of order n,
contained in at least one integral multiplicity, and satisfying at least
one total differential equation distinct from the equation dφ = 0 and
the equations of contact, which contains the differential of at least one 
derivative of order n, and which is independent of the integral contain- 
ing the multiplicity. This definition is analogous to that adopted by 
Natani (3) for characteristics of the first order. 

I have shown (4) that, when there are three independent variables, 
an equation of the second order admits two families of characteristics 
of this kind, if it be of rank 2, one family if it be of rank 1, but none 
if it be of rank 3. I have also (5) extended the definition of rank to 
equations with any number of independent variables and shown that 
the rank is invariant under contact transformation. 

I shall now extend this result to equations with m+1 independent
variables; and we shall see that such equations admit characteristics 
of this kind if the rank be 2 or 1, but not if it be 3 or more. To simplify 


Quart. J. Math, Oxford (2), 9 (1958), 210-20 




the writing of the proof, I shall consider characteristics of the second 
order only. The definition of characteristics of higher order follows 
exactly the same lines, and the results are entirely similar, the only 
difficulty being the greater complication of the notations required. 

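Let the independent variables be x, y_1,..., y_m, and let z be the
dependent variable. Let

∂z/∂x = p, ∂z/∂y_i = q_i, ∂²z/∂x² = r, ∂²z/∂x∂y_i = s_i, ∂²z/∂y_i∂y_j = t_ij (i,j = 1,...,m).
Also let


∂³z/∂x²∂y_i = r_i, ∂³z/∂x∂y_i∂y_j = s_ij, ∂³z/∂y_i∂y_j∂y_k = t_ijk
(i,j,k = 1,...,m).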


We may always suppose that the given equation contains the derivative
r. For, if it does not contain ∂²z/∂x² but contains any one of ∂²z/∂y_i²
(i = 1,...,m), a change of notation reduces it to the required form.
Again, if the equation does not contain ∂²z/∂x² nor any of ∂²z/∂y_i²
(i = 1,...,m), but contains, say, ∂²z/∂y_1∂y_2, the change of variables


gives an equation containing ∂²z/∂X².

Thus we suppose that the given equation contains the derivative r. 
We shall deal with analytic equations only, and consequently we may 
suppose the given equation solved for r. Let the given equation be 


r+F(x, y_1,..., y_m, z, p, q_1,..., q_m, s_1,..., s_m, t_11,..., t_mm) = 0, (1)

where F is an analytic function of its arguments in the neighbourhood 
of a set of initial values. 

We may suppose that every partial derivative involving two or more 
differentiations with respect to x is expressed, in terms of the variables 
and the remaining partial derivatives, by means of (1) and the equa- 
tions derived from (1) by differentiation. Thus the only equations of 
contact which we need consider are those containing partial derivatives 
which involve not more than one differentiation with respect to x.
Let us now use the symbol dF/dy_i to indicate the result of differentiating
F with respect to y_i, treating z and each partial derivative as a function
of x, y_1,..., y_m; and let (dF/dy_i) indicate that in calculating dF/dy_i we
omit all terms involving derivatives of z of the third order. Let


∂F/∂s_i = S_i, ∂F/∂t_ij = T_ij (i,j = 1,...,m).


Then, with this notation, differentiating (1) with respect to y_i, we have
r_i = −(dF/dy_i) − Σ_j S_j s_ij − Σ_{j≤k} T_jk t_ijk (i = 1,...,m). (2)


The equations of contact up to the third order, when we use (1) to
express the value of r, are


dz − p dx − Σ_j q_j dy_j = 0,
dp + F dx − Σ_j s_j dy_j = 0,
dq_i − s_i dx − Σ_j t_ij dy_j = 0 (i = 1,...,m), (3)


ds_i = r_i dx + Σ_j s_ij dy_j (i = 1,...,m), (4)
dt_ij = s_ij dx + Σ_k t_ijk dy_k (i ≤ j; i,j = 1,...,m). (5)


Substituting for r_i from (2) in (4) and bearing in mind that s_ij = s_ji, etc.,


we have
ds_i = −(dF/dy_i) dx + Σ_j (dy_j − S_j dx) s_ij − dx Σ_{j≤k} T_jk t_ijk (i = 1,...,m). (6)


Now, given any integral element of contact of the second order, i.e. 
given any set of values of the variables and the partial derivatives of 
the first and second orders, satisfying (1), it follows from the usual 
existence theorem that an infinity of integrals of (1) exist admitting 
this element, and such that all those partial derivatives of the third 
order which occur on the right of (5) and (6) take, simultaneously, any 
arbitrarily chosen set of values. Thus suppose, if possible, that a multi- 
plicity of integral elements of contact of the second order satisfies a 
total differential equation, distinct from dr+dF = 0 and equations (3),
containing at least one of ds_i, dt_ij (i,j = 1,...,m), which is independent
of the integral containing the multiplicity. Then the right-hand sides
of the equations (5) and (6), considered as linear forms in the partial
derivatives of the third order, must be linearly dependent for the values
of dy_i/dx (i = 1,...,m) associated with the multiplicity. For, if this
were not so, we could regard dy_j (j = 1,...,m), dx as constants and solve
(5) and (6) for certain of the partial derivatives of the third order; and,
this being done, we could certainly find integrals of (1) such that, for
the given values of the dy_i/dx (i = 1,...,m) and the given element of




contact of the second order, ds_i, dt_ij (i,j = 1,...,m) assume any
arbitrarily chosen set of values. Thus these differentials could not be
restricted to satisfying any total differential equation independent of 
the integral of (1). 

We therefore require to investigate the condition that the right-hand
sides of (5) and (6) are not linearly independent forms in the s_ij, t_ijk
(i,j,k = 1,...,m).

Noticing that s_ij = s_ji, t_ijk = t_jik, etc., we require to arrange the
distinct derivatives of the third order in a definite order. We write
the pairs of numbers i, j in the sequence 


(1,1), (1,2),..., (1,m); (2,2), (2,3),..., (2,m); ...; (m,m);
and the triplets (i,j,k) in the sequence
(1,1,1), (1,1,2),..., (1,1,m); (1,2,2), (1,2,3),..., (1,2,m); ...;
(2,2,2),..., (2,2,m); ...; (m,m,m).
We now require three lemmas:


(A) In the first sequence, the m pairs containing a selected number i
occur in ascending order of the second number. This is obvious if i = 1.
If i > 1, we simply observe that these pairs occur in the order (1,i),...,
(i−1,i), (i,i),..., (i,m).

(B) In the second sequence, the triplets containing any selected pair
of numbers j and k (j < k) occur in ascending order of the remaining
number. For there are three possibilities: i < j < k (occurring only if
j > 1), j < r < k (occurring only if j < k), and j < k < s. But (i,j,k)
precedes (j,r,k) which precedes (j,k,s), since i < j and r < k; and
the result follows immediately.

(C) In the sequence of triplets, those containing one selected number
i occur in the order of the sequence of pairs, applied to the other two
numbers. The possibilities are r < s < i, u < i < v (these occurring
only if i > 1), and i < j < k. We require to show that the triplets
(r,s,i), (u,i,v), (i,j,k) occur in the same order as the pairs (r,s), (u,v),
(j,k). Clearly the set of triplets (r,s,i) occur in the same order as that
in which the pairs (r,s) occur in the sequence of pairs; and the same is
true of the (u,i,v) and (u,v), and of the (i,j,k) and (j,k). If r < u,
(r,s,i) occurs before (u,i,v), and (r,s) before (u,v); and vice versa if
r > u. If r = u, (u,s,i) occurs before (u,i,v), and (u,s) before (u,v),
since s < i < v. Then (r,s,i) and (u,i,v) occur before (i,j,k), while
(r,s) and (u,v) occur before (j,k), since r < i < j, u < i < j; and the
result is thus established.
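The three ordering lemmas can be checked by enumeration for small m. The sketch below is an illustration only: it assumes that the two sequences are the lexicographic listings of non-decreasing pairs and triplets, and the function names are hypothetical. For convenience the check of (B) also includes the case j = k.

```python
from itertools import combinations_with_replacement

def pairs(m):
    """Pairs (i, j), i <= j, in the order (1,1), (1,2), ..., (m,m)."""
    return list(combinations_with_replacement(range(1, m + 1), 2))

def triplets(m):
    """Triplets (i, j, k), i <= j <= k, in the order (1,1,1), ..., (m,m,m)."""
    return list(combinations_with_replacement(range(1, m + 1), 3))

def remainder(t, items):
    """Remove each of `items` once from tuple t; None if some item is absent."""
    rest = list(t)
    for it in items:
        if it not in rest:
            return None
        rest.remove(it)
    return tuple(rest)

def check_lemmas(m):
    everything = list(range(1, m + 1))
    for i in everything:
        # (A): the other member of each pair containing i runs through 1..m.
        others = [remainder(p, (i,)) for p in pairs(m)]
        if [r[0] for r in others if r is not None] != everything:
            return False
        # (C): stripping i from the triplets containing i gives the pair sequence.
        stripped = [remainder(t, (i,)) for t in triplets(m)]
        if [r for r in stripped if r is not None] != pairs(m):
            return False
    for j, k in pairs(m):
        # (B): the remaining member of each triplet containing j, k runs through 1..m.
        left = [remainder(t, (j, k)) for t in triplets(m)]
        if [r[0] for r in left if r is not None] != everything:
            return False
    return True

print(all(check_lemmas(m) for m in range(1, 6)))
```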




Turning to the equations (5) and (6), we first show that, on a charac- 
teristic, we cannot have dx = 0. For suppose that dx = 0. Then (5) 
and (6) become 


dt_ij = Σ_k t_ijk dy_k (i ≤ j; i,j = 1,...,m), (7)
ds_i = Σ_j s_ij dy_j (i = 1,...,m). (8)


Now, if we arrange the equations (7) in the order of the ‘sequence of
pairs’ above and the terms on the right in the order of the ‘sequence
of triplets’, we see that the matrix of the coefficients of the t_ijk on the
right is a matrix with ½m(m+1) rows, and ⅙m(m+1)(m+2) columns.
By (B) above, each row consists of the elements dy_1, dy_2,..., dy_m, in
that order, but interspersed with zeros; while, by (C) above, in each row
the element dy_k occurs at least one place further to the right than it
does in the preceding row. This matrix is certainly of rank ½m(m+1),
unless dy_1 = ... = dy_m = 0. For, if not, if dy_k be the non-zero element
with highest suffix, we can select a determinant of ½m(m+1) rows,
having dy_k in the leading diagonal and zero above the leading diagonal,
which is non-zero.

Similarly, we show that the matrix of the s_ij on the right of (8) is
of rank m unless dy_1 = ... = dy_m = 0. Thus the right-hand sides of
(7) and (8) are linearly independent forms in the derivatives of the third
order unless dy_1 = ... = dy_m = 0, which is inadmissible if dx = 0.
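The rank statement for the case dx = 0 can be tried on small examples. The following sketch rests on an assumption about the garbled equation (7), namely that it reads dt_ij = Σ_k t_ijk dy_k for i ≤ j; the function names are hypothetical. It builds the coefficient matrix with rows ordered by the sequence of pairs and columns by the sequence of triplets, and computes its rank exactly with rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def rank(rows):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    mat = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(mat[0]) if mat else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            f = mat[i][c] / mat[r][c]
            if f != 0:
                mat[i] = [a - f * b for a, b in zip(mat[i], mat[r])]
        r += 1
        if r == len(mat):
            break
    return r

def coefficient_matrix(dy):
    """Coefficients of the t_ijk in dt_ij = sum_k t_ijk dy_k (assumed form of (7))."""
    m = len(dy)
    idx = range(1, m + 1)
    column = {t: c for c, t in enumerate(combinations_with_replacement(idx, 3))}
    rows = []
    for i, j in combinations_with_replacement(idx, 2):
        row = [0] * len(column)
        for k in idx:
            row[column[tuple(sorted((i, j, k)))]] += dy[k - 1]
        rows.append(row)
    return rows

print(rank(coefficient_matrix((1, 2, 3))))
print(rank(coefficient_matrix((0, 0, 0))))
```

For m = 3 the matrix has 6 rows and 10 columns, in agreement with the counts ½m(m+1) and ⅙m(m+1)(m+2).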


We must therefore have dx ≠ 0; and to simplify the calculations, we
consider a different system, equivalent to (5) and (6). Fixing the index
i, we may remove from (5) the restriction i ≤ j, which was imposed
only to avoid duplication. Then, multiplying (5) by −(dy_j−S_j dx),
summing on the index j, multiplying (6) by dx, and adding, we obtain


ds_i dx − Σ_j (dy_j−S_j dx) dt_ij + (dF/dy_i) dx²
= −Σ_j (dy_j² − S_j dy_j dx + T_jj dx²) t_ijj
− Σ_{j<k} (2dy_j dy_k − S_j dy_k dx − S_k dy_j dx + T_jk dx²) t_ijk


(i = 1,...,m) (9)


and clearly the system (3), (5), (6) is entirely equivalent to the system
(3), (5), (9), when dx ≠ 0, once again with the restriction i ≤ j in (5),
to avoid duplication.

Since each distinct s_ij occurs in one only of the equations (5), when
dx ≠ 0, we may solve the equations (5) for the ½m(m+1) derivatives
s_ij; and thus we need investigate only the conditions for the right-hand
sides of (9) to be linearly dependent forms in the t_ijk. Let us arrange
the equations (9) in the order i = 1,..., m and arrange the terms on the
right, according to the suffixes of the t_ijk, in the order of the ‘sequence
of triplets’ above. Then the matrix of the coefficients of the t_ijk, on
the right, is a matrix of m rows, and ⅙m(m+1)(m+2) columns. We see
that every row consists of the elements a_jk, where

−a_jj = dy_j² − S_j dy_j dx + T_jj dx²,

−a_jk = 2dy_j dy_k − S_j dy_k dx − S_k dy_j dx + T_jk dx² (j < k),
interspersed with zeros; by (C) above, we see that, in every row, these
elements occur in the same order: that of the ‘sequence of pairs’; and,
by (B) above, we see that, in each row, a selected element occurs at
least one place farther to the right than in the preceding row.

Once again, this matrix is certainly of rank m unless all its elements 
are zero. For, if not, selecting the non-zero element farthest to the 
right in the first row, we could pick out a determinant, with this 
element in the leading diagonal and zero above the leading diagonal, 
which would be non-zero. 

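Thus, on a characteristic, we must have each a_jj = a_jk = 0, i.e.

dy_j² − S_j dy_j dx + T_jj dx² = 0 (j = 1,...,m), (10)
2dy_j dy_k − S_j dy_k dx − S_k dy_j dx + T_jk dx² = 0 (j ≠ k; j,k = 1,...,m). (11)
Let μ_j1, μ_j2 be the roots of (10), regarded as a quadratic in dy_j/dx, so
that

μ_j1 + μ_j2 = S_j, μ_j1 μ_j2 = T_jj (j = 1,...,m). (12)

The equations (10) and (11) then become
(dy_j − μ_j1 dx)(dy_j − μ_j2 dx) = 0 (j = 1,...,m), (13)


(dy_j − μ_j1 dx)(dy_k − μ_k2 dx) + (dy_j − μ_j2 dx)(dy_k − μ_k1 dx) +

+{T_jk − μ_j1 μ_k2 − μ_j2 μ_k1} dx² = 0 (j ≠ k; j,k = 1,...,m). (14)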
Let us suppose first that the system (13) and (14) admits a consistent
solution in dy_j/dx (j = 1,...,m). Then, by suitable labelling of the roots
of (13), we can ensure that this solution is


dy_j − μ_j1 dx = 0 (j = 1,...,m). (15)
Substituting in (14), we have the necessary condition for consistency
T_jk = μ_j1 μ_k2 + μ_j2 μ_k1 (j < k). (16)
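The passage from (11) to (14), and from (14) with (15) to (16), is pure algebra and can be spot-checked with integers. This is a hedged sketch under the reading of (10) to (16) suggested by the surrounding text (in particular S_j = μ_j1+μ_j2 from (12)); the function and variable names are hypothetical.

```python
import random

def form_11(dyj, dyk, dx, mj1, mj2, mk1, mk2, Tjk):
    """Left-hand side of (11), with S_j and S_k expressed by (12)."""
    Sj, Sk = mj1 + mj2, mk1 + mk2
    return 2 * dyj * dyk - Sj * dyk * dx - Sk * dyj * dx + Tjk * dx ** 2

def form_14(dyj, dyk, dx, mj1, mj2, mk1, mk2, Tjk):
    """Left-hand side of (14)."""
    return ((dyj - mj1 * dx) * (dyk - mk2 * dx)
            + (dyj - mj2 * dx) * (dyk - mk1 * dx)
            + (Tjk - mj1 * mk2 - mj2 * mk1) * dx ** 2)

def residual_on_15(dx, mj1, mj2, mk1, mk2, Tjk):
    """Value of (11) after substituting the solution (15): dy = mu_1 dx."""
    return form_11(mj1 * dx, mk1 * dx, dx, mj1, mj2, mk1, mk2, Tjk)

rng = random.Random(0)
samples = [tuple(rng.randint(-9, 9) for _ in range(8)) for _ in range(200)]
identity_holds = all(form_11(*s) == form_14(*s) for s in samples)

print(identity_holds)
# The residual vanishes exactly when T_jk takes the value in (16).
print(residual_on_15(2, 1, 3, 4, 5, 1 * 5 + 3 * 4))
```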


If this condition be satisfied, the equations (14) become
(dy_j − μ_j1 dx)(dy_k − μ_k2 dx) + (dy_j − μ_j2 dx)(dy_k − μ_k1 dx) = 0 (j < k). (17)




Suppose first that not every μ_j1 = μ_j2; for example suppose μ_11 ≠ μ_12.
Then, putting dy_1 = μ_11 dx in (17) with j = 1, we see that we must
have (15) as before. Again, putting dy_1 = μ_12 dx in (17) with j = 1,
we obtain dy_k − μ_k2 dx = 0 (k = 1,...,m); (18)
and thus we have precisely two distinct sets of consistent solutions, of
(13) and (17), namely (15) and (18). If μ_j1 = μ_j2 (j = 1,...,m), it is
clear that (13) and (17) admit one set of consistent solutions only, i.e.
the two sets are confluent.

Each of the sets of solutions (15) and (18) of the equations (13) and
(17) which, when the conditions (16) are satisfied, are equivalent to
(10) and (11), leads to a system of characteristics of the equation (1).
For, substituting for dy_1,..., dy_m from (15) in (9), taking account of (10)
and (11), using (12) to express the values of the S_j (j = 1,...,m), and
dividing by dx, we obtain (remembering that t_ij = t_ji)

ds_i + Σ_j μ_j2 dt_ij + (dF/dy_i) dx = 0 (i = 1,...,m). (19)
And these, in view of the definition of (dF/dy_i), are total differential
equations in the variables making up the element of contact of the
second order, containing ds_1,..., ds_m, and independent of the integral
on which we suppose the multiplicity to lie, as required by the definition.
Substituting also from (15) in (3), we see that, when the conditions (16)
are satisfied, we have two systems of characteristics, distinct unless
every μ_j1 − μ_j2 = 0 (j = 1,...,m), confluent in the contrary case, one
system being defined by the equations


dy_j − μ_j1 dx = 0 (j = 1,...,m)
dz − (p + Σ_j μ_j1 q_j) dx = 0
dp + (F − Σ_j μ_j1 s_j) dx = 0
dq_i − (s_i + Σ_j μ_j1 t_ij) dx = 0 (i = 1,...,m)
ds_i + Σ_j μ_j2 dt_ij + (dF/dy_i) dx = 0 (i = 1,...,m) (20)


and the other by permuting μ_j1 and μ_j2 (j = 1,...,m) in (20), μ_j1 and
μ_j2 being defined by (12). It is easy to see, exactly as in the classical
case of two independent variables, that every integral of (1) is a locus
of characteristics of both systems.


It remains to find the necessary and sufficient conditions, in terms 




of the given equation, for the equations (16) to be satisfied. Consider 
the symmetric matrix 


8, 


whose rank defines the rank [(5) 112] of the equation (1). Putting in
the values (12) for S_j and T_jj (j = 1,...,m), and the values (16) for T_jk
(j ≠ k), we see that the matrix M is then equal to the matrix product


Pu Faz 4 
Paz + + 
x 
The rank of M cannot, therefore, exceed two; and, by considering the 


minors 


i M), 


we see that, if not every μ_j1 − μ_j2 is zero, so that there are two distinct
systems of characteristics of the second order, the rank of M (i.e. the
rank of (1)) is 2. If μ_j1 − μ_j2 = 0 (j = 1,...,m), so that there is one
system of characteristics only, the two matrices written above, of which
M is the product, have respectively two identical rows only, and two
identical columns only. Each is therefore of rank 1; and M, which
has a non-zero element, is therefore of rank 1.
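The rank argument can be illustrated numerically. The explicit form of M is lost in this copy, so the sketch below works under a stated assumption: M is taken to be the symmetric matrix of the quadratic form with leading entry 1, first row ½S_j, diagonal T_jj and off-diagonal entries ½T_jk, which under (12) and (16) equals ½(ab' + ba') with a = (1, μ_11,..., μ_m1) and b = (1, μ_12,..., μ_m2). The normalisation and all names are assumptions of this sketch; only the rank is being illustrated.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    mat = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(mat[0]) if mat else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            f = mat[i][c] / mat[r][c]
            if f != 0:
                mat[i] = [a - f * b for a, b in zip(mat[i], mat[r])]
        r += 1
        if r == len(mat):
            break
    return r

def form_matrix(mu1, mu2):
    """Assumed M = (a b^T + b a^T)/2 built from the two sets of roots."""
    a = [Fraction(1)] + [Fraction(v) for v in mu1]
    b = [Fraction(1)] + [Fraction(v) for v in mu2]
    n = len(a)
    return [[(a[i] * b[j] + b[i] * a[j]) / 2 for j in range(n)] for i in range(n)]

print(rank(form_matrix((1, 2, 3), (1, 5, 3))))   # roots not all coincident
print(rank(form_matrix((1, 2, 3), (1, 2, 3))))   # every pair of roots coincident
```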

Thus we see that in order that the equation (1) may admit two
distinct systems of characteristics of the second order, it is necessary
that it be of rank 2; and in order that it may admit one system only,
it is necessary that it be of rank 1. We shall now show that these
conditions are also sufficient.

Suppose that the matrix M above is of rank not exceeding 2. We
may still define μ_j1, μ_j2 (j = 1,...,m) to be the roots of the quadratic
equations (10) in the dy_j/dx (j = 1,...,m), so that S_j and T_jj (j = 1,...,m)
are still expressed by (12). Then, expressing the fact that, by the
hypothesis regarding M, the minor formed from the first, (j+1)th and




(k+1)th (j ≠ k) rows and columns of M is zero, and, using (12), we
have


{T_jk − μ_j1 μ_k2 − μ_j2 μ_k1}{T_jk − μ_j1 μ_k1 − μ_j2 μ_k2} = 0. (21)
Suppose, firstly, that at least one of the equations (10) has distinct
roots. By suitable labelling, we may suppose that it is the first, so
that μ_11 − μ_12 ≠ 0. Then, by interchanging where necessary the labels
of any other pair of roots μ_k1 and μ_k2 (k > 1), we can ensure that each
condition (21) in which j = 1 becomes
T_1k = μ_11 μ_k2 + μ_12 μ_k1 (k = 2,...,m). (22)
Let us now equate to zero the determinant formed from the first,
second, and (j+1)th rows (j > 1) and the first, second, and (k+1)th
columns (k > 1; k ≠ j) of M. We then have, using (12) and (22),


=—} (Hy {Tj — Meet Hye 
= 0. (23) 


Thus, since μ_11 − μ_12 ≠ 0, we once again have, from (22) and (23), the
conditions (16); and we have seen that, in these circumstances, M is 
of rank 2 and there are two distinct systems of characteristics of the 
second order. 

Secondly, suppose that 


μ_j1 − μ_j2 = 0 (j = 1,...,m).
Then the condition (21) gives


T_jk = 2μ_j1 μ_k1 (j ≠ k);


and thus, once again, we have the conditions (16) when we write


μ_j2 = μ_j1 (j = 1,...,m). In this case, we have shown that M is of rank 1,
and that there is one system of characteristics of the second order only. 
We therefore see that in order that the equation may admit two 
distinct systems of characteristics of the second order, it is necessary 
and sufficient that (1) be of rank 2; and in order that it may admit one
system only, it is necessary and sufficient that it be of rank 1.
We have seen that an equation of the second order with m+1 independent
variables, of any form, can be reduced to the form (1) by a
simple change of independent variables. But I have shown (loc. cit.) 
that the rank of an equation of the second order is invariant under 
any contact transformation. Thus we may at once extend these results 
to equations of any form. The results for characteristics of higher order 
than the second are entirely similar, and obtained in exactly the same 
way. We may summarize the whole matter in the theorem: 

THEOREM. A partial differential equation of the second order, in one 
dependent and any number of independent variables, admits two distinct 
systems of one-dimensional characteristics, of the kind defined earlier, of 
the second and all higher orders, if it be of rank 2. It admits one system 
only, i.e. the two systems are confluent, if it be of rank 1; while there are 
no characteristics of this kind if it be of rank 3 or more. 


The extension of Darboux’s method to equations of rank 2 or 1 can 
then be made, on exactly the same lines as I have already developed 
it for equations with three independent variables (4). 

There is an interesting link between the theory of one-dimensional 
characteristics, which we have developed, and the standard theory of 
the m-dimensional characteristics (known as Monge-characteristics) of 
an equation with m+ 1 independent variables, developed by Beudon (1). 
It is a standard result that, corresponding to a known integral of (1), 
the condition for the m-dimensional hyper-surface (‘surface’ if m = 2) 


P(x, y_1,..., y_m) = 0
to be a Monge-characteristic of (1), contained in the integral, is, with
our notation,

(∂P/∂x)² + Σ_j S_j (∂P/∂x)(∂P/∂y_j) + Σ_{j≤k} T_jk (∂P/∂y_j)(∂P/∂y_k) = 0, (24)

where we suppose that z and its derivatives, in S_j, etc., are replaced
by their values corresponding to the integral of (1). But the condition
that the matrix M above shall be of rank 2 or 1 is precisely the condition
for the left-hand side of (24) to be decomposed into factors, i.e. for (24)
to become


(∂P/∂x + Σ_j μ_j1 ∂P/∂y_j)(∂P/∂x + Σ_j μ_j2 ∂P/∂y_j) = 0, (25)


on replacing S_j, T_jk by their values given by (12) and (16).

Now the curves associated with the Cauchy-characteristics of (24) 
are known as the ‘bi-characteristics’ of (1). If the left-hand side of (24)
be irreducible, these bi-characteristics, corresponding to any integral 




of (1), form a complex of curves in the space of m+1 dimensions,
depending, in general, upon 2m—1 parameters. But, if (24) is reducible 
and decomposes into (25), the bi-characteristics form two congruences, 
each depending upon m parameters, one curve of each congruence 
passing through each point of the (m+1)-space. Thus we have the 
curious result that, if the bi-characteristics associated with any integral 
of an equation of the second order form a complex of curves, the 
equation does not admit one-dimensional characteristics; but, if the 
bi-characteristics form two congruences (which may be distinct or 
confluent), then two families of one-dimensional characteristics exist. 
Furthermore, we may assert that, in this case, the curves associated 
with these latter coincide with the bi-characteristics. For the Cauchy- 
characteristics of the two linear equations of the first order, represented
by (25), are defined by the equations


dx = dy_i/μ_i1 (i = 1,..., m),

dx = dy_i/μ_i2 (i = 1,..., m),


in which μ_i1 and μ_i2 (i = 1,...,m) are now functions of x, y_1,..., y_m,
since z and its derivatives are supposed replaced by their values
corresponding to the integral of (1). Thus we see from (15) and (18) that
the two families of Cauchy-characteristics of the two first-order
equations represented by (25) are precisely the two families of curves
associated with those characteristics, contained in the integral of (1),
which are defined by (20) above, and by a similar system of equations
with μ_j1, μ_j2 permuted.


REFERENCES 


1. J. Beudon, Bull. Soc. math. 25 (1897) 108-20.
2. E. Goursat, Équations aux dérivées partielles du second ordre (vol. 2) (Paris,
1898).
3. Natani, Die höhere Analysis, 388.
4. D. H. Parsons, D. Phil. Thesis (Oxford, 1952).
5. D. H. Parsons, Quart. J. of Math. (Oxford) (2) 8 (1957) 112-16.




CONNEXIONS FOR PARALLEL DISTRIBUTIONS 
IN THE LARGE (II) 


By A. G. WALKER (Liverpool) 
[Received 17 August 1957] 


1. Introduction 

In a previous paper† it was shown that on a differentiable manifold
an affine connexion always exists globally with respect to which one 
or more given distributions are parallel. It was also shown that, if the 
given system of distributions is integrable, the connexion can be chosen 
to be symmetric, i.e. torsion-free. 

The present paper continues the study of global connexions related 
to given distributions, and properties such as relative parallelism and 
path-parallelism with respect to a connexion are defined and
considered.‡ A number of existence theorems for global connexions are
given, these connexions being torsion-free whenever possible, and in 
each case a formula for the simplest connexion having the desired 
properties is constructed. All such formulae are expressed in terms of 
the projection tensors associated with the given distributions, and are 
simplified by means of a convenient notation for the various projections 
of a tensor. With this notation the calculations are very similar to 
those resulting from the use of forms and special frames. The present 
method has the advantage, however, that the resulting formulae for 
tensors and connexions are expressed in relation to a general coordinate 
system: no transformation from a special to a general frame is neces- 
sary, and formulae are in a convenient form for subsequent applications 
to special problems. 

In a later paper it will be shown how some of the present results lead 
to a definition of ‘torsional derivation’ and the construction of new 
concomitants for an almost complex structure. Certain holonomic pro- 
perties of some of the connexions given here will also be discussed in 


† A. G. Walker, Quart. J. of Math. (Oxford) (2) 6 (1955) 301-8. This paper
will be referred to as (1). 

‡ Some of these properties are closely related to properties of non-holonomic
submanifolds studied in the 1930’s by Vranceanu, P. Dienes, and others. This 
earlier work was all local, however, and was not concerned with the present 
problem of establishing existence theorems and constructing global connexions 
having the desired properties. 


Quart. J. Math. Oxford (2), 9 (1958), 221-31. 




another paper, where it will be shown how they are related to the 
foliation groups of Ehresmann. 

It will generally be assumed that the given structure (manifold and 
distributions) is of class C^∞, and it will then be seen that the
connexions we construct are all of the same class C^∞. If, however, the
given structure is analytic and if the manifold is such that there exists 
globally an analytic connexion, then the constructed connexions will 
be analytic. 


2. Projection tensors 

Let M be an n-dimensional manifold and D′, D″ two complementary
distributions over M, of dimensions r′, r″, where r′+r″ = n. At any
point x ∈ M the r′-plane and r″-plane belonging to D′ and D″ are
denoted by D′_x and D″_x; these are sub-spaces of the tangent plane T_x
and are complementary in the sense that they are disjoint and their
sum is T_x.

A vector u ∈ T_x decomposes to give u = u′+u″, where u′ ∈ D′_x and
u″ ∈ D″_x. The projection tensors at x associated with D′, D″ are the
endomorphisms a′, a″ of T_x given by


a′u = u′, a″u = u″.


They are mixed second-order tensors of ranks r′, r″ and satisfy the
usual identities


a′² = a′, a″² = a″, a′a″ = a″a′ = 0, a′+a″ = I. (1)


These tensors are defined at every point of M and form tensor fields 
of the same class as the given structure (M, D’, D”). 
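The identities satisfied by the two projection tensors can be seen in a small concrete case. The following sketch is an illustration only, with an arbitrarily chosen pair of complementary lines in the plane: D′ is taken to be spanned by (1, 0) and D″ by (1, 1), so that a vector (u1, u2) decomposes as (u1−u2)(1, 0) + u2(1, 1). The names `a1` and `a2` are hypothetical stand-ins for the two projection tensors.

```python
def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(q)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(mat, u):
    """Image of the vector u under the endomorphism mat."""
    return [sum(mat[i][k] * u[k] for k in range(len(u))) for i in range(len(mat))]

# Projection on D' = span{(1, 0)} along D'' = span{(1, 1)}, and its complement.
a1 = [[1, -1], [0, 0]]
a2 = [[0, 1], [0, 1]]
identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]

total = [[a1[i][j] + a2[i][j] for j in range(2)] for i in range(2)]

print(matmul(a1, a1) == a1, matmul(a2, a2) == a2)          # idempotent
print(matmul(a1, a2) == zero, matmul(a2, a1) == zero)      # mutually annihilating
print(total == identity)                                   # sum is the identity
print(apply(a1, [3, 5]), apply(a2, [3, 5]))                # the two projections of (3, 5)
```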

The vectors u′ = a′u and u″ = a″u are the projections of the
contravariant vector u. We can also define the projections of a covariant
vector and of any tensor, but this is done most simply in terms of their
components relative to a general local coordinate system. If a′^i_j and a″^i_j
are the components of a′ and a″, then for a contravariant vector (u^i)
the two projections are a′^i_p u^p and a″^i_p u^p, and for a covariant vector (v_i)
there are two projections a′^p_i v_p and a″^p_i v_p. For a more general tensor
there are two projections for each suffix, so that the tensor may be
partly or wholly projected in a number of ways. Every projection is
a tensor of the same kind, and a convenient notation is to attach one


or two primes to a suffix to denote projection by a′ or a″. We should
thus write


v_{i′} = a′^p_i v_p, v_{i″} = a″^p_i v_p,




and for a tensor 7'j/: two of the projections are 
Ti =a, TP, = aaj ay THe. 
Because of the identity a′+a″ = I, there is a relation between the
projections for each suffix, which can be written symbolically i′+i″ = i.
Thus
u^{i′}+u^{i″} = u^i, v_{i′}+v_{i″} = v_i, T^{i′}_{jk}+T^{i″}_{jk} = T^i_{jk}, etc.
The summation convention will operate regardless of primes: thus

since a′² = a′. Also, because a′a″ = a″a′ = 0, we have

= 0, vu = 0. 

Suppose now that L is any connexion. Denoting covariant differen- 
tiation with respect to L by a solidus, we have 8), = 0, and, from 
aj, = 0. (2) 
Also, from aa = 0 and (2), 

With our projection notation it is convenient to adopt the convention 
of projecting after differentiating, so that, for example, dj,,4? will be 
written dj-,. Then (3) gives 


(4) 
and from dad = 0 it can similarly be deduced that 
Gh, = Gjy. (5) 
I shall now write 
aj,(L) = (6) 


and observe that a^i_{jk}(L) is unaltered when we interchange D′ and D″,
i.e. a′ and a″. From (2), (4), and (5) we find
aj,(L) = = 
ai,(L) = aj,(L) = }- (7) 
aj,(L) = aj-,(L) = 0 
If Γ is another connexion and L = Γ+X, so that X = (X^i_{jk}) is a
tensor, a simple calculation gives the relation
aj,(L) = (8) 


3. Integrability and parallelism
Some of the expressions occurring in (1) can now be written more 
concisely. For example, the condition for D’ to be integrable ist 


= 9, (9) 
† As usual, T_[jk] is written for ½(T_jk−T_kj) and T_(jk) for ½(T_jk+T_kj).




where Γ is any symmetric connexion, this being obtained directly from
the fact that the system of partial differential equations 4? 2@, f = 0 
are required to be completely integrable. 

The condition for D’ to be parallel with respect to a connexion L 
can now be written ai,(L) = 0. (10) 


This can be obtained immediately as follows. If 5 denotes the absolute 
differential corresponding to dz‘ and if (u') is a vector which lies in D’ 
while undergoing parallel displacement, then du‘ = 0 and 4(a/,w”) = 0. 
Hence (8é!,)u? = 0 for all dx‘ and all vectors (u‘) in D’, i.e. a}, = 9, 
and (10) follows from (2) and (7). 

Similarly the condition for D” to be parallel with respect to L is 
a\.,(L) = 0, and combining this with (10) we see that the condition 
for both D’ and D” to be parallel with respect to L is 

aj,(L) = 0. (11) 

In (1) it was proved that there is a global connexion L which satisfies
(10) and is symmetric when D′ is integrable. It was also proved that
there is a global connexion which satisfies (11) and is symmetric when
both D′ and D″ are integrable. To obtain expressions for these
connexions we first choose a global symmetric connexion Γ of class C^∞ (or
C^ω if the given structure is analytic and if it is known that an analytic
connexion exists); it is well known that this can always be done on a
manifold of class C^∞. Then, if we write L = Γ+T and use (8), equations
(10) for L become equations for T,


= 4}, (0). 
For simplicity we choose T%,, = 0, Ti, = 0, and, for symmetry, 
Ti, = Thy = 

Then Ti, = (12) 
We now see from (9) that T^i_{jk} = T^i_{kj} when D′ is integrable. Thus
L = Γ+T, where T is given by (12), is a connexion which satisfies (10)
and is symmetric when D′ is integrable. This is the connexion given
in (1) though expressed differently.

By a similar method it can be shown that the connexion L = Γ+S,


h 
Si, = (13) 


satisfies (11) and is symmetric when both D’ and D” are integrable. 
This again is equivalent to a connexion given in (1). 
We observe that the connexions Γ+T and Γ+S have class C^∞, and
are analytic if the given structures and Γ are analytic. These
connexions are defined globally because the connexion Γ and the tensors
on the right in (12) and (13) are global. Similar results about class and 
global character will hold for other connexions constructed in this paper 
but will not be mentioned explicitly. 

The torsion tensor associated with the connexion L is $H^i_{jk} = L^i_{[jk]}$,
so that for L = $\Gamma$+S we have the torsion tensor $H^i_{jk} = S^i_{[jk]}$, since $\Gamma$ is
symmetric. Hence from (13) we find

It is easily verified that this tensor is in fact independent of the symmetric
connexion $\Gamma$ and is equivalent, to within a numerical factor,
to the torsion defined by Nijenhuis† for an almost complex structure.
Such a structure is determined by a tensor h satisfying $h^2 = -I$, and
the relation between our a and this h is $a = \tfrac12(1+ih)$. The torsion as
defined here appears naturally as the torsion of certain connexions 
associated with D’ and D” and is not merely defined as a tensor which 
vanishes when the given structure is integrable. It has, of course, the 
latter property, for the two terms on the right in (14) vanish when D’ 
and D” respectively are integrable. Conversely it follows from (7) that 


= = Hh, 
so that D’ and D” are integrable when $H^i_{jk} = 0$.


4. Relative parallelism 

An integral curve of a distribution D’ is a differentiable curve in M 
with the property that the tangent vector at any point x of the curve 
lies in D’. If D’ is integrable, the planes of D’ are tangent to a system 
of submanifolds, or laminations (foliations), and M is said to have a 
laminated (foliated) structure. In this case the integral curves of D’ 
are the differentiable curves lying in the laminations. 

I shall say that a vector field u is parallel relative to D’ (with respect
to a connexion L) if, for any points x, y on an integral curve of D’, the
vectors $u_x$, $u_y$ are parallel relative to this curve.‡ If D’ is integrable
and x, y are any two points on one of the laminations determined by
D’, then $u_x$, $u_y$ are parallel relative to any differentiable arc xy lying
in the lamination. The fact that these vectors need not be parallel
relative to an arc xy that does not lie in the lamination shows that

† Proc. Kon. Nederlandse Ak. (A), 58 (1955) 390-403, § 3.

‡ When referring to parallel vectors along a curve we find it convenient to
describe the vectors as parallel relative to the curve, and with respect to the
connexion.


relative parallelism is weaker than parallelism. It is clear that a parallel 
vector field is parallel relative to every distribution. 

In terms of local coordinates and the projection tensors associated 
with D’ and a complementary distribution D”, the vector field (u‘) 
is parallel relative to D’ if 5u' = 0 for all dx‘ lying in D’. Since 
du! = uj,dx’, this condition becomes 

ul, = 0. (15) 

Relative parallelism can be extended from a vector field to any 
distribution, and I shall say that a distribution D* is parallel relative 
to D’ (with respect to L) if the planes of D* at points of any integral 
curve of D’ are parallel relative to this curve. If (bj) is a projection 
tensor such that the vectors of D* are given by bju/ = u', we require 
5(b} w/) = 0 when du‘ = 0 for all u‘ in D* and all dx‘ in D’. Hence 
bi,,u'dx* = 0 for all wu! in D* and dz in D’, i.e. 

bi, bP at = 0. (16) 
This, then, is the condition to be satisfied by (b') for the associated 
distribution D* to be parallel relative to D’ with respect to L. 


5. Special relations 

In this section we consider various ways in which a connexion can 
be related to a given distribution D’ or complementary pair D’, D”.
If only D’ is given, then D” is taken to be any complementary distri- 
bution. 


(I) I shall say that D’ is semi-parallel with respect to L if it is parallel 
relative to itself. The condition for this is given at once by (16) with 
a in place of b, and is therefore equivalent to 

a\,(L) = 0. (17) 
Clearly, D’ is semi-parallel if it is parallel. Also, D’ is integrable if it 
is semi-parallel with respect to a symmetric connexion. 

(II) We may have D” parallel relative to D’. The condition for this 
is given by (16) with @ in place of 6 and is therefore equivalent to 

ai..(L) = 0, (18) 
This does not imply any integrability conditions when L is symmetric. 
(III) Interchanging D’ and D” in (II), we see that D’ is parallel 


relative to D” if aj,,.(L) = 0. This can be combined with (18), and D’, 
D” are each parallel relative to the other if 


ajnAL)=0, abyAL) = 0. (19) 




As will be seen from Theorem 2 of § 6, these again do not imply any 
integrability condition when L is symmetric. 


(IV) We may have D’ parallel and at the same time D” parallel 
relative to D’. The conditions for this are (10) and (18), and combining 
these we see that they can be written 


ajy(L)=0, = 0. (20) 
These, of course, require D’ to be integrable if L is symmetric; D” need 
not, however, be integrable. 
(V) Another property a distribution D’ may have with respect to 
a connexion is that of being path-parallel. A given connexion L deter- 
mines a system of auto-parallels, or paths (geodesics in the case of a 
Riemannian manifold), such that for any point x and tangent vector u 
at x, there is just one path through x tangent to u. We say that D’ is
path-parallel with respect to L if, for every point x in M and vector
u in $D'_x$, the path determined by x and u is an integral curve of D’.
Writing u‘ = dx‘/dt for the tangent vector to a path, we have 5u' = 0, 


and wo require = di, wut dt 0 
for all vectors u‘ in D’. Hence dj,u/ u* = 0 for all u‘, ie. from (7), 
ai. (L) = 0. (21) 


This, then, is the condition for D’ to be path-parallel with respect to L. 

From (17) we see that, if D’ is semi-parallel (with respect to L), then 
it is path-parallel. Path-parallelism is weaker than semi-parallelism, 
because, for example, D’ can be path-parallel with respect to a symmetric
connexion without being integrable. If, however, D’ is both
integrable and path-parallel with respect to L, then D’ is also
semi-parallel with respect to the symmetric part of L. To see this we write
L = $\Gamma$+X, where $\Gamma$ is symmetric and X is skew-symmetric. Then,


from (8), ai,(L) ai, 
and (21) becomes aj,,)(C) = 0. Since D’ is given integrable, 
= 0 and hence = 0, 

i.e. D’ is semi-parallel with respect to $\Gamma$.

We can further prove the lemma: 

Lemma. If L is symmetric and if D’ is integrable, path-parallel, and 
parallel relative to D” with respect to L, then D’ is parallel. 

Since L is symmetric and D’ is integrable, we have aj,,,(LZ) = 0, and 
hence from (21), since D’ is path-parallel, aj,(L) = 0. For D’ to be 




parallel relative to D” we have (18) with D’ and D” interchanged, i.e. 
ai,,.(L) = 0, and combining this with aj,(L) = 0 we have aj.,(L) = 0. 
This is the condition for D’ to be parallel, which proves the lemma. 


6. Special connexions 

We are now in a position to construct various connexions related to
a given distribution D’ or complementary pair D’, D”. Choosing a
global symmetric connexion $\Gamma$ as before, we find the conditions to be
satisfied by the tensor L-$\Gamma$ in order that L should have the desired
property. The conditions turn out to be algebraic, and in each case
we find the solution which is simplest in relation to the chosen $\Gamma$.

For the remainder of this paper I shall write a}, for aj,(T). 

THEOREM 1. For any complementary distributions D’, D” there is a
global symmetric connexion with respect to which both distributions are 
path-parallel. 

From (21) and (8), D’ and D” are both path-parallel with respect to
L = $\Gamma$+A if A satisfies the algebraic equations

= Urs Arey = Uy (22) 
In addition to these restrictions we want A to be symmetric, and the 
simplest solution is therefore 
Aj, = t+ (23) 
The connexion $\Gamma$+A, where A is given by (23), satisfies the requirements
of the theorem, which is therefore proved.

This theorem can be generalized to any system of disjoint distribu- 
tions, where every sum of distributions of the system is required to be 
path-parallel. Here again it is not difficult to prove that a global 
symmetric connexion exists as required. 

It will now be shown that further conditions can be imposed on the 
connexion in Theorem 1. 

THEOREM 2. For any complementary distributions D’, D” there is a
global symmetric connexion with respect to which each distribution is path- 
parallel and parallel relative to the other. 

We now want L to be symmetric and satisfy (19) in addition to the
previous conditions. Writing L = $\Gamma$+B, we require B to be symmetric
and to satisfy (22) (with B in place of A) and also, from (8)


an l ; 
d (19), Boye = Bye: = 


The simplest solution is found by taking By, = Bj. = 0, and can be 


written 




It can be verified that the connexion $\Gamma$+B with B given by (24) satisfies
the requirements of the theorem.

We observe that B in (24) is the symmetric part of S in (13), i.e.
$B^i_{jk} = S^i_{(jk)}$, and therefore B = S when D’ and D” are both integrable
since S is then symmetric.

If we relax the requirements that D’ and D” should be path-parallel
and seek a symmetric connexion with respect to which D’ and D” are
parallel relative to each other, the above solution $\Gamma$+B is not the
simplest, which is easily seen to be $\Gamma$+C, where

Che = + (25) 

Returning to the connexion L = $\Gamma$+B constructed to prove Theorem 2,
it is notable as being probably the simplest global connexion
(related to a chosen $\Gamma$) which is symmetric and is associated with any
two complementary distributions, without assuming any particular
properties such as integrability. With this connexion L we have from
(19) and (21),

= (L) = = = 0, (26) 
and it follows that aj,(L) is skew-symmetric. This tensor is in fact the 
torsion tensor defined in § 3, for, since the present L is symmetric, we 
have (14) with ZL in place of [ and from (26) we find 

Hi, = ai,(L). (27) 
This simple relation is true for any connexion L which satisfies the 
conditions of Theorem 2. 

If, in Theorem 2, D’ is integrable, then by the lemma of § 5, D’ is 
parallel. Thus for any integrable D’ and complementary D” there is 
a symmetric connexion with respect to which D’ is parallel and D” is 
path-parallel and parallel relative to D’. This is contained in the 
following theorem. 

THEOREM 3. For any complementary distributions D’, D” there is a
global connexion with respect to which D’ is parallel and D” is path-parallel 
and parallel relative to D’; this connexion can be chosen formally so that 
it becomes symmetric when D’ is integrable. 

To prove this we construct a connexion L which satisfies (10) and 
(18), together with (21) with D’ and D” interchanged, i.e. aj;-4-)(L) = 9. 
Choosing a symmetric $\Gamma$ as before and writing L = $\Gamma$+V, we get the
equations for V as

Vin = View = = 
We also want V to be symmetric when D’ is integrable, i.e. when 




aj, = 0. For the simplest solution we therefore take Vj, = Vj-,.. = 0 


and find i 
Vix = 2A — — (28) 


With this V the connexion $\Gamma$+V is found to satisfy all the requirements
of Theorem 3.

As expected, V becomes B in (24) when D’ is integrable. We also
observe that, if D’ and D” are both integrable, then, by the lemma of
§ 5, D” is parallel with respect to the symmetric connexion $\Gamma$+V. In
this case V and S in (13) become the same tensor.

It should be noted that, if in Theorem 3 we exclude the last requirement,
then $\Gamma$+V is not the simplest solution. The simpler connexion


Li, = (29) 
makes both D’ and D” parallel but does not necessarily become symmetric
when D’ and D” are integrable. Here $\Gamma$ is any symmetric connexion,
but in the next section the same connexion L appears with $\Gamma$
as the Christoffel connexion given by a certain metric.


7. Distributions and metrics 

Distributions D’, D” are orthogonal with respect to a (Riemannian)
metric g if at every point x, $g_{ij}u^iv^j = 0$ for all vectors u in D’ and v in
D”, the components of g, u, v being relative to some coordinate system
in the neighbourhood of x. In projection notation this condition for
orthogonality is equivalent to

= 0 (30) 

when D’, D” are complementary distributions. We now observe that 

For any complementary distributions D’, D” there is a positive-definite
metric g with respect to which D’, D” are orthogonal. The class of g is $C^\infty$,
and is analytic if the given structure is analytic and if M is known to
admit a positive-definite analytic metric.

This follows from the fact that M admits a positive-definite metric,
h say, of class $C^\infty$. The required metric is now given by


= hey they 
in every coordinate neighbourhood; this symmetric tensor is defined 
globally, and g is at once seen to be positive-definite. If M, D’, D”, and 
h are analytic, then g is analytic. 
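
The displayed formula for g has not survived in this copy. As a hedged illustration only, the sketch below assumes the standard construction $g = a^{T}ha + a'^{T}ha'$, where a is the projection onto D’ along D” and $a' = 1 - a$; the function names and the 2×2 example are hypothetical and are not taken from the paper.

```python
def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def transpose(x):
    return [list(row) for row in zip(*x)]


def orthogonalising_metric(h, a):
    """Assumed form g = a^T h a + (1 - a)^T h (1 - a) for a projection a."""
    n = len(a)
    a_c = [[(1 if i == j else 0) - a[i][j] for j in range(n)] for i in range(n)]
    t1 = matmul(transpose(a), matmul(h, a))
    t2 = matmul(transpose(a_c), matmul(h, a_c))
    return [[t1[i][j] + t2[i][j] for j in range(n)] for i in range(n)]


def inner(g, u, v):
    """Value of g_ij u^i v^j."""
    return sum(u[i] * g[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


# D' is spanned by (1, 0), D'' by (1, 1); a projects onto D' along D''.
a = [[1, -1], [0, 0]]
h = [[1, 0], [0, 1]]          # any positive-definite metric would do
g = orthogonalising_metric(h, a)
print(g)
print(inner(g, [1, 0], [1, 1]))
```

In this toy case the two spanning vectors are not orthogonal for h but are orthogonal for g, which is the property the text asks for.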
THEOREM 4. For any complementary distributions D’, D” which are
orthogonal with respect to a metric g, there is a global connexion L with 
respect to which D’ and D” are parallel and g is constant, i.e. Jijn = 9. 


We write L = $\Gamma$+W, where $\Gamma$ is now chosen to be the Christoffel
connexion given by g, so that $g_{ij,k} = 0$, a comma denoting covariant
differentiation with respect to $\Gamma$. We write
Gi; = Jip Gi; = Jip 4}, aj, = a},{T), 
and establish the identities 
Jip +Gjp Uk = 9. (32) 
To prove (31), we have g;,-;, = 0 from (30) and hence 
Similarly, 4;; = d,,. To prove (32) we use the identities (2), (4), (5), 
and (7) with [ in place of Z. We have 
= +Giryp from (30), 
= since = 9, 
= from (31), 
= = —Ijp Uke 
The equations for W are given by ai,(L) = 0 from (11) and g;;,, = 0, 
and, from (8) and g;;, = 0, these become 
= ae 
Wie Wi = 9 
From (7) and (32) we see at once that 
Wi, = = a}, 
is a solution of these equations, and the theorem is proved. 
The connexion $\Gamma$+W is determined uniquely by D’, D”, and g and
is the simplest having the desired properties. It is not however the 
most general connexion having these properties since it can easily be 
verified that aj, is not the only solution of equations (33). This means 
that further geometrical requirements could be imposed on the con- 
nexion in Theorem 4. One condition we cannot, however, impose in 
general is that of symmetry, for, if Z is symmetric, then g,, = 0 
implies that L = [ and hence, from (11), aj, = 0. If and only if this 
condition is satisfied is there a symmetric connexion satisfying the 
requirements of Theorem 4. 


(33) 




A MULTIVARIATE GENERALIZATION OF 
TCHEBICHEV’S INEQUALITY 


By P. WHITTLE 
(Applied Mathematics Laboratory, D.S.I.R., New Zealand) 


[Received 10 September 1957] 


1. A CONSIDERABLE literature has been built upon the simple and basic 
Tchebichev inequality, most of it concerned with the strengthening of 
the result when additional information on the variate distribution makes 
this possible. [For reviews of the literature see Shohat and Tamarkin 
(5), Fréchet (2) 130.] A few writers have considered the extension of 
the result to the multivariate case (potentially useful for theoretical 
work, although not for practical statistics). 

In its conventional form Tchebichev’s proposition is that, if a statistical
variate $x$ has zero mean and variance $v$ and

$$P = \operatorname{prob}(|x| > a), \tag{1}$$

then

$$P \le va^{-2}. \tag{2}$$

If the statement is modified to

$$P \le \{va^{-2}\}, \tag{3}$$

where

$$\{y\} = \begin{cases} y & (0 \le y \le 1), \\ 1 & (y \ge 1), \end{cases} \tag{4}$$

then it becomes the strongest assertion possible in the absence of any
further information on the distribution of $x$. This follows from the fact
that distribution functions can always be found for which the equality
sign in (3) is fulfilled. (For $v \le a^2$ consider the distribution whose mass
is concentrated at the points $-a-\epsilon$, $0$, $a+\epsilon$, where $\epsilon$ is an arbitrarily
small positive number, in amounts $\tfrac12va^{-2}$, $1-va^{-2}$, $\tfrac12va^{-2}$; for $v > a^2$
consider the distribution with masses $\tfrac12$ at $x = \pm\sqrt{v}$.)
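
The three-point distribution just described can be checked with exact arithmetic. The sketch below is illustrative only: the function names are hypothetical, P is taken to be prob(|x| > a), and the values v = 1, a = 2 are arbitrary.

```python
from fractions import Fraction


def three_point(v, a, eps):
    """Masses v/(2a^2), 1 - v/a^2, v/(2a^2) at -(a+eps), 0, a+eps (needs v <= a^2)."""
    tail = Fraction(v) / (2 * Fraction(a) ** 2)
    return [(-(a + eps), tail), (Fraction(0), 1 - 2 * tail), (a + eps, tail)]


def moments_and_tail(dist, a):
    """Return (mean, variance, prob(|x| > a)) of a discrete distribution."""
    mean = sum(x * p for x, p in dist)
    var = sum((x - mean) ** 2 * p for x, p in dist)
    tail = sum(p for x, p in dist if abs(x) > a)
    return mean, var, tail


v, a = Fraction(1), Fraction(2)
for eps in (Fraction(1, 10), Fraction(1, 1000)):
    mean, var, P = moments_and_tail(three_point(v, a, eps), a)
    # P stays at v/a^2 while the variance tends to v as eps shrinks
    print(eps, mean, var, P)
```

The tail probability equals v/a² for every positive ε, while the variance exceeds v only by a factor (1+ε/a)², so the bound is approached as closely as desired.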

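In this paper the inequality is generalized to the case of $n$ variates
$x_1, x_2, \ldots, x_n$ with zero means and known second moments $v_{jk}$, where $P$
is now defined by

$$1-P = \operatorname{prob}(|x_j| \le a_j;\ j = 1, 2, \ldots, n). \tag{5}$$

The obvious extension of (3),

$$P \le \Bigl\{\sum_j v_{jj}a_j^{-2}\Bigr\}, \tag{6}$$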


Quart. J. Math. Oxford (2), 9 (1958), 232-40. 




is actually the best possible if the variates are uncorrelated, but may 
be very inefficient if there is considerable correlation between variates. 

The case $n = 2$ has been fully treated by Berge (1) and Lal (3),
whose methods are in part due to Pearson (4). Lal shows that

$$P \le \Bigl\{\frac{1}{2a_1^2a_2^2}\Bigl[a_2^2v_{11} + a_1^2v_{22} + \sqrt{(a_2^2v_{11} + a_1^2v_{22})^2 - 4a_1^2a_2^2v_{12}^2}\Bigr]\Bigr\}. \tag{7}$$

Berge had previously obtained this result for the special case

$$v_{11}/a_1^2 = v_{22}/a_2^2.$$

Lal gives a result for general $n$, but his inequality is the sharpest possible
only for $n = 2$, when it reduces to (7).

It is true in the multivariate as in the univariate case that substantially
sharper results can be obtained if more is known of the distribution
than just the first and second moments. For instance, if the $x_j$
were known to be independent, then (6) could be strengthened to

$$P \le 1 - \prod_j \bigl(1 - \{v_{jj}a_j^{-2}\}\bigr), \tag{8}$$

which is a great improvement in that the dependence upon $n$ is
exponential rather than linear. However, generalization on the basis of
the first and second moments alone is a necessary first step in a
multivariate theory.

Notation. I shall abbreviate the terms ‘positive definite’ and ‘non- 
negative definite’ to ‘p.d.’ and ‘n.n.d.’ respectively, and it is to be 
understood that matrices designated by either of these terms are 
symmetric. 


2. The lemmas of this section are not essential for an understanding 
of the main argument and results, and the reader who wishes can pro- 
ceed directly to § 3. 

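Lemma 1. If $r$, $s$ are positive numbers such that $r+s = 1$, and $B_1$, $B_2$
are p.d. matrices, then

$$M = rB_1^{-1} + sB_2^{-1} - (rB_1+sB_2)^{-1} \tag{9}$$

is n.n.d., and equals zero only if $B_1 = B_2$. In this sense $B^{-1}$ is a convex
matrix function of the elements of $B$ if $B$ is p.d.

The matrices

$$R = B_1^{-1}, \quad S = B_2^{-1}, \quad T = (rS+sR)^{-1} \tag{10}$$

are all p.d. We have

$$\begin{aligned}
M &= rR + sS - (rR^{-1}+sS^{-1})^{-1} \\
  &= rR + sS - RTS \\
  &= rRT(rS+sR) + sST(rS+sR) - RTS \\
  &= rs[RTR + STS - RTS - STR] \\
  &= rs(R-S)T(R-S),
\end{aligned} \tag{11}$$

which is plainly n.n.d.; the fourth line uses $RTS = STR$ and
$r^2+s^2-1 = -2rs$. If $M$ were zero, then $\xi'(R-S)T(R-S)\xi$ would
be zero for any vector $\xi$; since $T$ is p.d., this implies $(R-S)\xi = 0$ for
all $\xi$, whence $R = S$ and $B_1 = B_2$.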
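
Lemma 1 can be spot-checked numerically. The following sketch is illustrative only: it assumes M has the form $rB_1^{-1} + sB_2^{-1} - (rB_1+sB_2)^{-1}$, restricts itself to 2×2 matrices, and uses hypothetical helper names and arbitrary test matrices.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as a list of rows."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def comb(r, m1, s, m2):
    """Entrywise r*m1 + s*m2."""
    return [[r * m1[i][j] + s * m2[i][j] for j in range(2)] for i in range(2)]


def lemma1_gap(r, b1, b2):
    """M = r*B1^{-1} + s*B2^{-1} - (r*B1 + s*B2)^{-1}, with s = 1 - r."""
    s = 1 - r
    lhs = comb(r, inv2(b1), s, inv2(b2))
    rhs = inv2(comb(r, b1, s, b2))
    return [[lhs[i][j] - rhs[i][j] for j in range(2)] for i in range(2)]


def is_nnd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is n.n.d. iff its diagonal and determinant are >= 0."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] >= -tol and m[1][1] >= -tol and det >= -tol


B1 = [[2.0, 0.5], [0.5, 1.0]]     # positive definite
B2 = [[1.0, -0.3], [-0.3, 3.0]]   # positive definite
M = lemma1_gap(0.4, B1, B2)
print(M, is_nnd_2x2(M))
```

Taking $B_1 = B_2$ in the same function gives a matrix that vanishes up to rounding, in line with the equality clause of the lemma.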
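Lemma 2. If $V$ and $B$ are arbitrary p.d. matrices, then

$$\Pi(B) = \operatorname{tr}(VB^{-1}) \tag{12}$$

is a convex function of the elements of $B$.

We have, for arbitrary, unequal p.d. $B_1$ and $B_2$,

$$r\Pi(B_1) + s\Pi(B_2) - \Pi(rB_1+sB_2) = \operatorname{tr}(VM), \tag{13}$$

where $M$ is the n.n.d. matrix defined by (9). If $M$ has spectral
representation

$$M = \sum_j \mu_j\eta_j\eta_j', \tag{14}$$

where the $\mu_j$ are non-negative and not all zero, then

$$\operatorname{tr}(VM) = \sum_j \mu_j(\eta_j'V\eta_j) > 0, \tag{15}$$

which proves the assertion.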


Lemma 3. If $V$ is p.d. then
(i) there is a unique $B$ which minimizes $\Pi = \operatorname{tr}(VB^{-1})$ subject to the
conditions that $B$ be p.d. and have prescribed diagonal elements;
(ii) this $B$ satisfies

$$V = B\Lambda B, \tag{16}$$

where $\Lambda$ is a diagonal matrix of positive elements $\lambda_j$;
(iii) the minimized value of $\Pi$ is

$$\min \Pi = \sum_j \lambda_jb_{jj}; \tag{17}$$

(iv) it follows that $V$ has a unique representation of form (16), $B$, $\Lambda$
having the properties described in (i), (ii).

Consider the space of the $\tfrac12n(n-1)$ variables $b_{jk}$ $(j > k)$, and let $b$ denote
the representative point in this space. Let $\Gamma$ denote the domain in this
space for which $B$ is n.n.d. for the given values of the diagonal elements
$b_{jj}$. It is in the interior of this finite domain, where $B$ is p.d., that $b$
must be chosen.

If the spectral representation of $B$ is

$$B = \sum_j \beta_j\xi_j\xi_j', \tag{18}$$

then

$$\Pi(B) = \sum_j \beta_j^{-1}(\xi_j'V\xi_j). \tag{19}$$

As $b$ approaches the boundary of $\Gamma$, one or more of the $\beta_j$ will approach
zero, and $\Pi$ will become indefinitely large since none of the coefficients
of the $\beta_j^{-1}$ in (19) is less than the least eigenvalue of $V$, which is a fixed
positive number.

Since $\Pi(B)$ is finite and continuous inside $\Gamma$ but approaches plus
infinity on the boundary, and $\Gamma$ is finite, $\Pi(B)$ must reach its minimum
in the interior of $\Gamma$. The minimum point will be a stationary point
since the derivatives of $\Pi(B)$ are also continuous inside $\Gamma$. But there
is at most one stationary point since, by Lemma 2, $\Pi(B)$ is convex. It
follows that there is exactly one stationary point, and that at this
point $\Pi(B)$ assumes its minimum value.

We now establish (ii) by equating to zero the differential coefficients
$\partial\Pi/\partial b_{jk}$ (see § 3 for details). The $\lambda_j$ are the Lagrangian multipliers
associated with the prescription of the $b_{jj}$, and are positive since
$\Lambda = B^{-1}VB^{-1}$ is p.d.

Then (iii) follows from (16) and the definition of $\Pi$, while (iv) is an
immediate consequence of (i), (ii).


3. We wish to find an upper bound to the probability $P$ that the
sample point $x$ does not lie in the rectangular interval

$$-a_j \le x_j \le a_j \quad (j = 1, 2, \ldots, n), \tag{20}$$

which we shall denote by $C$. Let the equation

$$S(x) = 1, \tag{21}$$

where

$$S(x) = x'Ax = \sum_{j,k} a_{jk}x_jx_k, \tag{22}$$

define an ellipsoid lying entirely inside $C$. The matrix $A$ must thus
be p.d., while, if the planes $x_j = \pm a_j$ are not to cut the ellipsoid, we
must have

$$a^{jj} \le a_j^2, \tag{23}$$

where the $a^{jk}$ are the elements of $A^{-1}$.

Since all points outside $C$ lie in the region $S(x) > 1$, we have, if $f(x)$
is the frequency function,

$$P \le \int_{S>1} f(x)\,dx \le \int_{S>1} S(x)f(x)\,dx \le E[S(x)] = \operatorname{tr}(VA). \tag{24}$$

The margin of the inequality will be reduced if the ellipsoid fills $C$ as
well as possible, i.e. if the planes (20) are tangent planes and equality
holds in (23).

It is convenient to set $A^{-1} = B$ and summarize these results as
follows:


THEOREM 1. If the variates $x_1, x_2, \ldots, x_n$ have zero means and covariance
matrix $V$, and if $1-P = \operatorname{prob}(|x_j| \le a_j;\ j = 1, 2, \ldots, n)$, then

$$P \le \{\operatorname{tr}(VB^{-1})\}, \tag{25}$$

where $B$ is any p.d. matrix with diagonal elements given by

$$b_{jj} = a_j^2. \tag{26}$$


The result could be extended to the case of asymmetric intervals
$\beta_j \le x_j \le \alpha_j$; the ellipsoid would not then be central. If $B$ is taken as
purely diagonal, then (25) reduces to (6). In order to sharpen the
inequality as far as possible, we now minimize with respect to $B$
the expression on the right of (25), subject to the conditions imposed.


THEOREM 2. The best possible inequality based on the first and second
moments of a distribution with non-singular covariance matrix $V$ is

$$P \le \{\operatorname{tr}(VB^{-1})\}, \tag{27}$$

where $B$ is the unique solution of the equation

$$V = B\Lambda B \tag{28}$$

which fulfils the conditions of Theorem 1 ($\Lambda$ being an unprescribed p.d.
diagonal matrix). This inequality is equivalent to

$$P \le \Bigl\{\sum_j \lambda_ja_j^2\Bigr\}. \tag{29}$$

By Lemma 3 the expression on the right of (25) is minimum at its single
stationary point. In order to locate the stationary point we set

$$\frac{\partial}{\partial b_{jk}}\Bigl[\operatorname{tr}(VB^{-1}) + \sum_i \lambda_ib_{ii}\Bigr] = 0 \quad (j, k = 1, 2, \ldots, n), \tag{30}$$

the $\lambda_j$ being Lagrangian multipliers. The passage from (30) to (28) is
clear, and (29) follows from (28).

In order to prove that the inequality is best-possible, we must show
that equality of $P$ and $\{\operatorname{tr}(VB^{-1})\}$ can be attained.

Now, the coordinates of the point $x^{(j)}$ where the ellipsoid touches the
plane $x_j = a_j$ are

$$x_k^{(j)} = b_{jk}/a_j \quad (j, k = 1, \ldots, n). \tag{31}$$

If $\sum_j \lambda_ja_j^2 = p^2 \le 1$, consider the distribution which assigns probability
mass $1-p^2$ to the origin and $\tfrac12\lambda_ja_j^2$ to the points $\pm(x^{(j)}+\epsilon)$ $(j = 1, 2, \ldots, n)$,
$\epsilon$ being a vector with arbitrarily small positive elements. This distribution
is readily found to have zero mean, covariance matrix $B\Lambda B = V$,
and by construction fulfils the equality relation in (27). If, on the other
hand, $p > 1$, consider the distribution which assigns probability mass
$\lambda_ja_j^2/2p^2$ to the points $\pm px^{(j)}$ $(j = 1, 2, \ldots, n)$. For this distribution, which
also fulfils the required conditions, $P = 1$, and the equality of (27) is
again fulfilled.
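
As a hedged numerical illustration for n = 2, the sketch below builds the extremal distribution with ε set to zero and checks that its covariance matrix is V. It assumes the touching points are $x^{(j)}_k = b_{jk}/a_j$ and that the optimal off-diagonal element b is the smaller root of $v_{12}b^2 - (a_2^2v_{11} + a_1^2v_{22})b + v_{12}a_1^2a_2^2 = 0$; the function names and the numerical values are hypothetical.

```python
import math


def inv2(m):
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def mul2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def extremal_distribution(v, a):
    """Return (Lambda, p^2, distribution) for n = 2, with eps taken as zero."""
    a1, a2 = a
    c = v[0][0] * a2 ** 2 + v[1][1] * a1 ** 2
    root = math.sqrt(c * c - 4 * (v[0][1] * a1 * a2) ** 2)
    b = (c - root) / (2 * v[0][1])           # smaller root; needs v12 != 0
    B = [[a1 ** 2, b], [b, a2 ** 2]]
    Binv = inv2(B)
    lam = mul2(Binv, mul2(v, Binv))          # Lambda = B^{-1} V B^{-1}
    weights = [lam[j][j] * a[j] ** 2 for j in range(2)]
    p2 = sum(weights)
    if p2 > 1:
        raise ValueError("this sketch only covers the case p^2 <= 1")
    dist = [((0.0, 0.0), 1 - p2)]
    for j in range(2):
        x = tuple(B[j][k] / a[j] for k in range(2))   # touching point
        dist.append((x, weights[j] / 2))
        dist.append((tuple(-t for t in x), weights[j] / 2))
    return lam, p2, dist


def covariance(dist):
    """Second-moment matrix; the distribution is symmetric, so the mean is zero."""
    return [[sum(p * x[i] * x[j] for x, p in dist) for j in range(2)]
            for i in range(2)]


V = [[1.0, 0.6], [0.6, 2.0]]
lam, p2, dist = extremal_distribution(V, (2.0, 3.0))
print(lam)               # off-diagonal entries vanish up to rounding
print(p2)                # the bound sum_j lambda_j a_j^2
print(covariance(dist))  # reproduces V up to rounding
```

With ε = 0 the mass points lie on the faces of the rectangle rather than just outside it, so the sketch verifies the moment conditions only; the small outward shift in the text is what makes P itself approach the bound.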


4. The case $n = 2$ is easily treated explicitly since only the one
constant $b_{12} = b_{21}$ (= $b$, say) is available for variation on the right of
(25). Minimization leads to a quadratic equation in $b$ whose smaller
root must be chosen if $B$ is to be p.d. Substitution of this root in (25)
yields (7).
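
The minimization over b can be checked by brute force. The sketch below is illustrative only: it assumes that (7) has the Berge-Lal closed form coded in `lal_bound`, and the function names, grid size and numerical values are hypothetical choices.

```python
import math


def trace_bound(v, a1, a2, b):
    """tr(V B^{-1}) for B = [[a1^2, b], [b, a2^2]]."""
    det = (a1 * a2) ** 2 - b * b
    return (v[0][0] * a2 ** 2 + v[1][1] * a1 ** 2 - 2 * v[0][1] * b) / det


def lal_bound(v, a1, a2):
    """Assumed closed form of (7), before truncation at 1."""
    c = v[0][0] * a2 ** 2 + v[1][1] * a1 ** 2
    root = math.sqrt(c * c - 4 * (v[0][1] * a1 * a2) ** 2)
    return (c + root) / (2 * (a1 * a2) ** 2)


def grid_minimum(v, a1, a2, steps=20001):
    """Smallest tr(V B^{-1}) over a grid of b strictly inside (-a1*a2, a1*a2)."""
    lim = a1 * a2
    return min(trace_bound(v, a1, a2, -lim + 2 * lim * i / steps)
               for i in range(1, steps))


V = [[1.0, 0.6], [0.6, 2.0]]
print(trace_bound(V, 2.0, 3.0, 0.0))   # diagonal B, i.e. the bound (6)
print(grid_minimum(V, 2.0, 3.0))       # numerical minimum over b
print(lal_bound(V, 2.0, 3.0))          # closed form
```

For these values the diagonal choice b = 0 gives a visibly weaker bound than the minimum, and the grid minimum agrees with the closed form to well within the grid resolution.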
Inequality (7) has the following special cases:

$$v_{12} = 0: \quad P \le \Bigl\{\frac{v_{11}}{a_1^2} + \frac{v_{22}}{a_2^2}\Bigr\}, \tag{32}$$

$$v_{12}^2 = v_{11}v_{22}: \quad P \le \Bigl\{\max\Bigl(\frac{v_{11}}{a_1^2}, \frac{v_{22}}{a_2^2}\Bigr)\Bigr\}. \tag{33}$$

Case (33) is particularly interesting since the result is correct although
$V$ is singular. In §§ 2, 3 we excepted the case of singular $V$, which is
not easily dealt with by the methods used there. Our results hold for
$V$ arbitrarily near to singularity, however, and, since it is intuitively
obvious that the l.u.b. of $P$ is a continuous function of $V$, the results
yield valid bounds even for singular $V$.

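In fact (7) is valid even in the exceptional case when $\operatorname{tr}(VB^{-1})$ has
no stationary point at all, and the optimizing $b$ is located on the
boundary of $\Gamma$ (rather, arbitrarily near the boundary). This happens
when $a_2^2v_{11} = a_1^2v_{22}$ and $v_{12}^2 = v_{11}v_{22}$. In this case

$$\Pi = \operatorname{tr}(VB^{-1}) = \frac{2a_2v_{11}}{a_1\bigl(a_1a_2 + b\,\operatorname{sgn}(v_{12})\bigr)}. \tag{34}$$

This expression has no turning-point, but reaches its minimum value
inside $\Gamma$ against the boundary, for $b \to a_1a_2\operatorname{sgn}(v_{12})$, when

$$\Pi \to v_{11}/a_1^2 = v_{22}/a_2^2.$$

This limit is exactly the value given by equation (7).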


5. I have not been able to solve the general equation (28) explicitly
for $B$, and it is unlikely that the general solution is simple enough to
be useful. However, if the bounds are not prescribed in advance, then
$V$ can be factorized in the form (28) in an infinite number of ways, and
the diagonal elements of the resulting $B$ will provide a set of numbers $a_j$.
In any particular case it should be possible to find by experiment a
factorization which yields bounds of approximately the right form for
the purpose in hand.
The obvious factorization is 

= (c?Vije*(c? V4), (35) 
where $c$ is a positive scalar and $V^{1/2}$ is the positive square root of $V$, i.e.
the matrix obtained by taking the positive square root of all eigenvalues
in the spectral representation of $V$. If the elements of $V^{1/2}$ are denoted
by vi’, then we have a? = cri), (36) dq 


p< (2. (= (37) 


We obtain a more general factorization by taking any non-singular 
matrix H for which HH’ is diagonal, and setting 


A = HH’, (38) 

B = (39) 

The factorization which is optimal, in that it yields the least $P$ for
a region $C$ of given content $2^n\prod_j a_j$, is readily found to be that for which
$\lambda_ja_j^2$ is constant. Of course, if one does not require $C$ to be rectangular,
then the optimal region is an ellipse $x'V^{-1}x = $ constant. The rectangular
regions have a particular importance, however, since they correspond
to fixed ‘confidence bands’ for the $x_j$.


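6. As an example, let us consider a finite circulant process, i.e.
assume that

$$v_{jk} = R_{j-k}, \tag{40}$$

$$R_j = R_{n-j} \quad (j, k = 1, \ldots, n). \tag{41}$$

If the spectral ordinates are defined by

$$F_k = \sum_{j=1}^{n} R_je^{2\pi ijk/n}, \tag{42}$$

then the spectral representation of the covariance matrix $V$ is

$$v_{jk} = n^{-1}\sum_{m=1}^{n} F_me^{2\pi im(j-k)/n}, \tag{43}$$

so that the diagonal elements $v_{jj}^{(1/2)}$ of $V^{1/2}$ are

$$v_{jj}^{(1/2)} = n^{-1}\sum_{k=1}^{n} F_k^{1/2}. \tag{44}$$

The bounds $a_j$ obtained by factorizing $V$ as in (35) are thus independent
of $j$. If $P$ denotes the probability that any of the $x_j$ fall outside the
range $-a \le x \le a$, then we see from (37) and (44) that

$$P \le \Bigl\{na^{-2}\Bigl(n^{-1}\sum_{k=1}^{n} F_k^{1/2}\Bigr)^2\Bigr\}. \tag{45}$$

The upper bound of $P$ is thus determined by the ‘square mean root’
of the spectral ordinates. Equation (45) has the following special cases:

$$R_j = v\delta_{j0}: \quad P \le \{nv/a^2\}; \qquad R_j = v: \quad P \le \{v/a^2\}. \tag{46}$$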
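
The circulant bound is easy to evaluate numerically. The sketch below is illustrative only: it assumes that (45) amounts to $(\sum_k F_k^{1/2})^2/(na^2)$ truncated at 1, with unnormalized spectral ordinates $F_k = \sum_j R_je^{2\pi ijk/n}$, and it indexes the autocovariances from 0 to n-1 (so `r[0]` plays the part of $R_n = R_0$). The function names are hypothetical.

```python
import cmath
import math


def spectral_ordinates(r):
    """F_k = sum_j r[j] exp(2*pi*i*j*k/n); real when r[j] == r[n-j]."""
    n = len(r)
    return [sum(r[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n)).real
            for k in range(n)]


def circulant_bound(r, a):
    """min(1, (sum_k sqrt(F_k))^2 / (n a^2)) for autocovariances r[0..n-1]."""
    n = len(r)
    # rounding removes tiny spurious values of either sign left by floating point
    roots = [math.sqrt(max(round(f, 12), 0.0)) for f in spectral_ordinates(r)]
    return min(1.0, sum(roots) ** 2 / (n * a * a))


print(circulant_bound([1.0, 0.0, 0.0, 0.0], 4.0))    # uncorrelated variates
print(circulant_bound([1.0, 1.0, 1.0, 1.0], 4.0))    # perfectly correlated variates
print(circulant_bound([1.0, 0.5, 0.25, 0.5], 4.0))   # an intermediate case
```

The uncorrelated case returns nv/a², the value given by (6), and the perfectly correlated case returns v/a², the value quoted for R_j = v; intermediate correlation falls between the two.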
Our proofs hold only for a finite set of variates, but it is interesting to
examine the formal extension of the example to the case of a continuous
circulant process, i.e. to a continuum of variates $x(t)$ $(0 \le t \le T)$ of
zero mean for which

$$E[x(t_1)x(t_2)] = R(t_1-t_2), \tag{47}$$

$$R(s) = R(-s) = R(T-s) \quad (0 \le s, t_1, t_2 \le T). \tag{48}$$
In virtue of its periodicity and evenness R(s) can be represented by 
a Fourier series we 
R(s) = (49) 
where ¢, = ¢_,. Extract now the finite set of variates 
a) (j = 1,2,...,). (50) 
This constitutes a discrete circulant process with spectral ordinate 
=n (51) 


By (45), the probability P”™ that any of xf”,..., fall outside 
—a« <2 < satisfies 


Pm <n} = ( (52) 
As n increases, II will converge to a finite limit if and only if the sum 


¥ 4! converges, in which case the limit is 


One is tempted to conjecture the more general result: that, if a stationary
process $\{x(t)\}$ has spectral distribution function $F(\omega)$, then, provided
that appropriate regularity conditions are fulfilled, a finite bound of
Tchebichev type can be set for the probability that $x(t)$ lies between
prescribed finite bounds over a prescribed finite $t$-interval if and only
if the integral $\int [F'(\omega)]^{1/2}\,d\omega$ converges. (If $F'(\omega)$ is infinite for some
$\omega$, then the integral must be suitably interpreted.) The conjecture is
too academic to be worth following up, however, since it is possible
(and in many cases necessary) to improve the Tchebichev bound
considerably by the introduction and exploitation of relatively weak
additional assumptions concerning the variate distribution function.

The construction of Tchebichev-type inequalities for random func- 
tions will be considered in a later publication. 

I am indebted to Dr. W. A. Waugh for drawing my attention to 
Lal’s paper. 


REFERENCES 


1. P. O. Berge, ‘A note on a form of Tchebycheff's theorem for two variables’, 
Biometrika 29 (1937) 405-6. 

2. M. Fréchet, Généralités sur les Probabilités (Premier Livre) (Paris, 1950). 

3. D. N. Lal, ‘A note on a form of Tchebycheff’s inequality for two or more
variables’, Sankhyā 15 (1955) 317-20.

4. K. Pearson, ‘On generalized Tchebycheff theorems in the mathematical theory
of statistics’, Biometrika 12 (1918-19) 284-96.

5. J. A. Shohat and J. D. Tamarkin, The Problem of Moments (American Math. 
Soc., New York, 1943). 




Numerical 
Analysis 


D. R. HARTREE, F.R.S.
Plummer Professor of Mathematical Physics in the
University of Cambridge


The purpose of this book is to give an introduction to the 
theory and practice of carrying out numerical calculations 
of various kinds. The main topics considered are finite 
differences, interpolation, quadrature, numerical integration 
of ordinary differential equations, matrices and the solution 
of linear simultaneous equations, non-linear equations, 
functions of two variables and partial differential equations. 
The theoretical treatment is restricted to such aspects as 
provide a basis for or throw light on practical numerical 
methods; the importance of checking is emphasized; and 
there is a chapter on the main tools of numerical work and 
their use. The main changes from the first edition are fuller 
treatments of Gaussian quadrature formula, of solution of 
ordinary differential equations with two-point boundary 
conditions and of partial differential equations; an intro- 
duction to Whittaker’s cardinal function has been added, 
and the treatment of programming for digital computers 
has been considerably shortened in view of other publica- 
tions on the subject. 


Royal 8vo, 318 pages, with 32 text-figures. Second edition 
42s. net 


OXFORD UNIVERSITY PRESS 


